Soil variability is a limiting factor in making
accurate  predictions  of  soil  performance  at  any  particular 
position  on  the  landscape.     A  large  number  of  studies  have 
been  made  to  quantify  soil  variability,  but  a  large  portion 
of  them  ignored  the  multivariate  character  of  soils  and  the 
geographic  aspect  of  soil  variability.     Data  from  151 
pedons  in  northwest  Florida  were  selected  (i)  to  determine 
the  important  properties  affecting  soil  variability  and 
(ii)  to  evaluate  the  soil  variability  in  the  area  studied 
using geostatistics.

Data  were  non-normally  distributed  but  statistical 
techniques  employed  did  not  require  the  assumption  of 
normality.     This  result  could  support  the  presence  of 
systematic patterns of soils.

Principal component analysis was used to reduce the
number  of  soil  properties  to  study  the  soil  variability. 
Two  sets  of  data  were  used:  weighted  average  values  of  soil 
properties,  and  A  horizon  properties.     Horizon  thickness 
was  used  as  the  weighting  criterion.    Variables  were 
standardized  to  mean  zero  and  variance  one.     Plots  of  soil 
properties  in  the  plane  of  the  principal  components, 
varimax  rotation,  analysis  of  eigenvalues,  eigenvectors, 
and collinearity, and calculation of correlation
coefficients  between  soil  properties  and  principal 
components  were  used  to  select  important  properties  for 
evaluation  of  soil  variability. 

A  nested  analysis  of  variance  indicated  that 
properties  selected  by  the  principal  component  analysis 
were  differentiating  properties. 

Geostatistical  analysis  was  applied  to  the  properties 
selected. The within-soil series variance was used as
the criterion to assess stationarity. Drift was present.
Consequently,  residuals  were  used  to  compute  semi- 
variograms. 

Semi-variograms  of  total  sand  and  clay  contents  showed 
structure.     Nugget  variance  was  present  in  all  semi- 
variograms.     Ranges  varied  from  15  to  35  km.  Soil 
variability  was  direction-dependent.     The  N-S  and  NW-SE 
were  the  directions  of  maximum  variability.     Organic  carbon 
content had a large point-to-point variation.
All observed semi-variograms had a characteristic wave
pattern that indicated a cyclic variation of soil
properties.

Kriged  standard  error  diagrams  were  functions  of  the 
nugget  variance  and  showed  areas  where  more  samples  are 
required  to  increase  the  precision  of  estimates. 

Fractal  dimensions  indicated  the  scale-dependent 
character of soil variability.

INTRODUCTION


The  fundamental  purpose  of  a  soil  survey  is  to 
estimate  the  potentials  and  limitations  of  soils  for  many 
specific  uses.     Soil  delineations  are  mapped  to  be  as 
homogeneous  as  possible  in  order  to  correlate  the 
adaptability  of  soils  to  various  crops,  grasses,  and  trees; 
and  to  predict  their  behavior  and  productivity  under 
different management practices (Soil Survey Staff, 1951;
1981).

The quality of soil surveys has improved over the
years as understanding of soil has improved, but
soil variability remains one of the main constraints to
reliable soil interpretations and is a limiting factor for
making accurate predictions of soil performance at any
particular position on the landscape.

The  study  and  understanding  of  soil  variability 
represents  a  cornerstone  for  improving  soil  surveys. 
Belobrov  (1976),  a  Russian  soil  scientist,  pointed  out  that 
"The  degree  of  approximation  between  the  true  and  the 
observed  soil  variability  does  not  depend  on  the  nature  of 
the  soil  cover,  but  mainly  on  the  methods  of  investigation" 
(p. 147).

For several years, soil scientists used methods of
investigation  which  did  not  consider  the  "real  nature"  of 
soils,  because  they  ignored  the  systematic  variation  of 
soils  on  the  landscape  and  assumed  a  random  variation  of 
soils  in  space.     On  the  other  hand,  despite  the  fact  that 
it  has  been  recognized  that  a  soil  map  unit  is  imperfect  to 
varying  degrees,  depending  on  the  scale  of  the  map  and  the 
nature  of  the  soil  (Soil  Survey  Staff,  1975),  most  soil 
surveys in the U.S.A. have accepted an unrealistic model in
which  map  units  encompass  soil  bodies  that  form  discrete, 
internally  uniform  units,  with  abrupt  boundaries  at  their 
edges  (Hole  and  Campbell,  1985). 

Studies  of  soil  variability  have  not  been  consistent. 
These  studies  have  considered  a  random  variation  of  soils 
and  at  the  same  time  they  have  used  a  limited  number  of 
observations  for  characterizing  map  units  to  establish  the 
range  of  variation  of  observed  properties.     The  assumption 
has  been  that  properties  measured  at  a  point  also  represent 
the  unsampled  neighborhood.     The  extent  to  which  this 
assumption  is  true  depends  on  the  degree  of  spatial 
dependence  among  observations. 

The  number  of  studies  for  quantifying  soil  variability 
has  sharply  increased  in  the  last  10  years,  but 
quantification  still  remains  a  problem.     A  large  proportion 
of  quantitative  studies  are  based  on  untested  assumptions, 
ignore the multivariate character of soils, or use a biased
selection of properties to represent the soil variability,
increasing  the  risk  of  erroneous  conclusions. 

For  these  reasons  a  large  soil  data  base  was  selected 
in  northwest  Florida  with  the  following  objectives:   (i)  to 
discover  which  soil  properties  most  strongly  influence  the 
soil  variability  in  the  area  studied,  and  (ii)  to  study  how 
geostatistics  can  be  used  in  evaluating  soil  variability. 


LITERATURE  REVIEW 


Principal  Component  Analysis 

The  multivariate  character  of  soil  is  well  recognized; 
a  large  set  of  measurements  of  soil  properties 
(morphological, chemical, physical, and mineralogical) can
be  derived  from  a  single  sample.     The  complete  set  of 
available  data  is  not  always  used  for  numerical  analyses. 
Hole  and  Campbell  (1985)  indicated  that  the  selection  of 
soil  properties  depends  on  the  objectives  of  the  study,  and 
also  reflects  the  constraints  imposed  by  cost,  time, 
effort,  and  access. 

Logically correlated variables, such as soil pH and base
saturation, are generally so highly covariant that only one
of them should be included in the analysis. Particle-size
fractions (sand, silt, and clay) always add up to 100%, and
therefore the whole set of particle-size data should not be
included in the analysis. Consequently, in the process of selecting
soil  properties,  there  is  an  important  question  to  be 
answered:  Are  the  selected  soil  properties  the  most 
important  to  represent  the  variability  of  the  complete  set 
of  data? 
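The closure problem for particle-size fractions can be illustrated numerically. The following is a minimal sketch, assuming hypothetical sand, silt, and clay percentages generated at random (they are not data from this study); it shows that when the three fractions sum to 100%, their correlation matrix has an eigenvalue of essentially zero, which is the collinearity that argues for leaving one fraction out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical particle-size data (percent) for 50 samples; not from the study.
sand = rng.uniform(40.0, 90.0, size=50)
clay = rng.uniform(2.0, 0.5 * (100.0 - sand))  # clay takes part of the remainder
silt = 100.0 - sand - clay                     # closure: fractions sum to 100

X = np.column_stack([sand, silt, clay])
R = np.corrcoef(X, rowvar=False)               # 3 x 3 correlation matrix

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
eigenvalues = np.linalg.eigvalsh(R)
smallest_eigenvalue = eigenvalues[0]
print("eigenvalues of R:", np.round(eigenvalues, 6))
```

A near-zero smallest eigenvalue means one fraction is an exact linear function of the other two, so it carries no additional information.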

Webster (1977) pointed out that when one soil property
is  measured  in  a  set  of  individual  sampling  units,  the 
measured  values  can  be  represented  by  their  positions  on  a 
single  line.     The  relation  between  any  pair  of  individuals 
can  be  represented  by  the  distance  between  them  and  the 
relations  among  several  individuals  can  be  established 
simultaneously  from  their  relative  positions  on  the  line. 
When several properties are measured, however, each property
adds an axis, and it becomes almost impossible to visualize
the positions of the individuals and the relations among
them simultaneously. Thus, he indicated
that  an  alternative  way  of  dealing  with  multivariate  data 
is  to  arrange  the  individuals  along  one  or  more  new  axes. 
This  reduction  of  an  arrangement  in  many  dimensions  to  a 
few  dimensions  is  known  as  ordination. 

The  two  most  common  methods  of  ordination  are  Factor 
Analysis  (FA)  and  Principal  Components  Analysis  (PCA). 
Shaw and Wheeler (1985) said that in both techniques new
variables are defined as mathematical transformations of
the original data. However, FA assumes that the original
variable is influenced by various determinants: a part
shared by other variables, known as the common variance;
and a unique variance which consists of both a variance
accounted for by influences specific to each variable and
also a variance relating to measurement error. In
contrast, PCA assumes that statistical variation in the
variables is explained by the variables themselves, in this
case by the common variance. PCA is recommended when there
are  high  correlations  between  variables,  a  large  number  of 
variables,  and  a  need  for  only  simple  data  reduction.  The 
major  objective  in  PCA  is  to  select  a  number  of  components 
that  explain  as  much  of  the  total  variance  as  possible, 
whereas  FA  is  used  to  explain  the  interrelationship  among 
the original variables (Afifi and Clark, 1984). PCA has
the advantage that the values of principal components
are relatively simple to compute and interpret.

PCA  is  a  method  that  has  been  used  to  reduce  the 
number  of  variables  without  losing  important  information 
(Webster,  1977).     In  general,  the  analysis  finds  the 
principal  axes  of  a  multidimensional  configuration  and 
determines  the  coordinates  of  each  individual  in  the 
population  relative  to  those  axes.     Then,  the  data  can  be 
represented  in  a  few  dimensions  by  projecting  the  points 
orthogonally  on  the  principal  axes. 
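The projection described above can be sketched in a few lines. This is an illustrative example only, assuming a hypothetical data matrix of 151 samples by 5 properties generated at random and a helper named `pca_scores` introduced here for the purpose; it standardizes the variables, finds the principal axes as eigenvectors of the correlation matrix, and projects the samples onto them.

```python
import numpy as np

def pca_scores(X):
    """Eigenvalues, eigenvectors, and PC scores from the correlation matrix."""
    # Standardize each variable to mean zero and variance one
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1]      # largest variance first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    scores = Z @ eigenvectors                  # orthogonal projection on the axes
    return eigenvalues, eigenvectors, scores

rng = np.random.default_rng(1)
# Hypothetical correlated data: 151 samples x 5 properties (not the study data)
latent = rng.normal(size=(151, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.3 * rng.normal(size=(151, 5))

eigenvalues, eigenvectors, scores = pca_scores(X)
explained = eigenvalues / eigenvalues.sum()    # share of total variance per PC
cumulative = np.cumsum(explained)
print("cumulative share of variance:", np.round(cumulative, 3))
```

The first two columns of `scores` are the coordinates that would be plotted in the plane of the first two PCs, and `cumulative` is the kind of figure quoted when a study reports that the first few PCs account for a given percentage of the total variance.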

The  basic  idea  of  PCA  is  to  create  new  variables 
called  the  principal  components  (PC)   (Afifi  and  Clark, 
1984).     Each  new  variable  is  a  linear  combination  of  the  X 
variables  and  can  therefore  be  written  as 


$PC = A_1 X_1 + A_2 X_2 + \cdots + A_n X_n$   (1)

where PC = principal component
      A  = coefficient (eigenvector)
      X  = variable

Coefficients of these linear combinations are chosen
to  satisfy  the  following  requirements: 

(i) $\mathrm{Var}(PC_1) \geq \mathrm{Var}(PC_2) \geq \cdots \geq \mathrm{Var}(PC_n)$.

(ii)  The  values  of  any  two  PCs  are  uncorrelated. 

(iii)  For  any  PC  the  sum  of  the  squares  of  the  coefficients 
is  one. 
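The three requirements can be checked numerically. The sketch below is an assumption-laden illustration, not the procedure used in this study: it builds a hypothetical set of four correlated variables, takes the coefficients as eigenvectors of the correlation matrix (one common way of obtaining them), and then verifies that the PC variances are ordered, that the PCs are uncorrelated, and that each coefficient vector has unit sum of squares.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical data: 100 samples, 4 variables sharing a common signal
base = rng.normal(size=(100, 1))
X = base + 0.5 * rng.normal(size=(100, 4))

# Standardize, then take eigenvectors of the correlation matrix as coefficients
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigenvalues, A = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, A = eigenvalues[order], A[:, order]
PC = Z @ A                                       # each column is one PC

pc_variances = PC.var(axis=0, ddof=1)            # requirement (i)
pc_correlations = np.corrcoef(PC, rowvar=False)  # requirement (ii)
squared_sums = (A ** 2).sum(axis=0)              # requirement (iii)

print("PC variances:", np.round(pc_variances, 3))
print("sum of squared coefficients:", np.round(squared_sums, 6))
```

With standardized variables the PC variances equal the eigenvalues, so the ordering in requirement (i) is simply the ordering of the eigenvalues.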

Cuanalo  and  Webster  (1970)  used  PCA  in  a  study  of 
numerical classification and ordination in which
morphological, physical, and chemical soil properties (pH,
clay, silt, fine sand, proportion of stones, consistence,
water tension, color, mottling, and peatiness) were
measured at depths of 13 cm and 38 cm at 85 sites
randomly sampled within physiographic units near Oxford,
England. The variables were standardized to unit variance
and the population was centered at the origin. It was
found that the first six PCs represented almost 70% of the
total variation present in the original data. The first
three  PCs  represented  more  than  50%  of  the  total  variance. 
The  first  component  showed  large  contributions  from  water 
tension,  and  chroma  in  both  the  topsoil  and  the  subsoil. 
In  the  second  component,  contribution  of  fine  sand  in  the 
topsoil  (13  cm)  and  subsoil  (38  cm)  was  dominant.     Hue  and 
value  made  large  contributions  to  the  third  component.  The 
projection  of  the  population  scatter  on  the  plane  defined 
by  the  first  two  PCs  gave  the  most  informative  display  of 
relations in the whole space. These authors suggested that
when numerical data are available, the data should be
examined  first  by  ordination  procedures;  then,  the  data 
selected  by  the  ordination  procedure  can  be  used  with  a 
numerical  classification  to  decide  if  such  classification 
grouped  data  satisfactorily. 

Norris  (1972)  used  PCA  to  study  trends  in  soil 
variation.     He  described  several  morphological  soil 
properties  (stage  of  organic  matter  decomposition, 
percentage  of  stones,  structure,  consistence,  porosity, 
roots,  biological  activity,  and  color  in  terms  of  presence 
or  absence  of  gley  or  dark  colors)  in  410  pedons,  307 
pedons  located  in  woods  and  103  pedons  located  in  farmland. 
The first PC accounted for 39% of the total variance, and
corresponded  to  a  trend  from  deep,  stoneless  pedons 
developed  on  a  clayey  formation  to  pedons  developed  on 
shallow  limestone  on  steep  slopes.  The  second  PC  accounted 
for  14%  of  the  total  variance  and  separated  pedons  located 
on  farmland  from  those  located  in  the  woods.     He  concluded 
that  the  PCs  served  as  a  summary  of  soil  variation  in  the 
area,  because  they  accounted  for  a  known  percentage  of  the 
soil  variation  and  were  correctly  defined  in  terms  of  the 
properties  used  to  describe  the  soil. 

Webster  and  Burrough  (1972)  sampled  the  first  two 
horizons  from  84  soil  pedons  and  recorded  selected  soil 
properties (soil color, CaCO3 content, depth to CaCO3,
total penetrable soil depth, clay content, organic matter
content, cation exchange capacity (CEC), pH, exchangeable
Mg  and  K  contents,  and  available  P  content).     They  used  PCA 
to  reduce  the  dimensionality  of  the  data,  and  found  that 
the  first  two  PCs  accounted  for  55%  of  the  total  variance 
(40%  the  first  component  and  15%  the  second  component). 
Separate  contributions  to  the  components  were  determined  by 
projecting vectors on the component axes. They
established that those properties determined in the field
(CaCO3 content, depth to CaCO3, clay content, and subsoil
color)  were  closely  correlated  and  well  represented  in  one 
dimension  in  the  first  component.     The  properties  measured 
in  the  laboratory  (organic  matter  content,  CEC,  and 
exchangeable  Mg  content)  contributed  most  to  the  second 
component,  indicating  differences  in  management  rather  than 
natural  soil  differences.     Results  of  the  numerical 
classification  were  supported  by  showing  the  distribution 
of  sampling  sites  in  space  projected  on  to  the  plane  of  the 
first  two  components  and  showing  the  frequency  distribution 
of  the  first  PC.     There  was  a  good  agreement  among  the 
results.     Therefore,  it  was  concluded  that  when  PCs 
represent  the  variables  that  explain  soil  variation  the 
components  can  be  mapped  as  isarithms  and  the  maps  have 
interpretable  meaning. 

Kyuma  and  Kawaguchi  (1973)  employed  PCA  to  grade  the 
chemical  potentiality  of  41  Malayan  paddy  soil  samples;  23 
physical, chemical, and mineralogical properties were
evaluated. The first four PCs accounted for 75% of the
total  variance.     The  first  PC  was  highly  positively 
correlated  with  electrical  conductivity,  exchangeable  Ca, 
Mg,  Na,  and  K  contents,  moisture,  CEC,  available  Si 
content,  and  0.2  M  HCl-soluble  K.     The  first  PC  was  highly 
negatively  correlated  with  the  kaolin  mineral  content.  All 
of  these  properties  were  relevant  to  the  chemical 
potentiality of the soil; thus, the first PC was called the
chemical  potentiality  component.     The  standardized  scores 
of  the  first  PC  were  computed.     These  scores  were  used  for 
grading  soils  in  terms  of  the  chemical  potentiality.  The 
authors  stated  that  the  result  of  grading  was  reasonable. 
Placed  at  the  top  of  the  scale  were  soils  developed  on 
juvenile  marine  sediments.     Soils  having  high  sand  and/or 
kaolin  content  were  at  the  bottom  of  the  scale.  The 
authors  concluded  that  PCA  was  useful  for  comparing  the 
soil  fertility  status  among  soils. 

Burrough  and  Webster  (1976)  used  PCA  with  Similarity 
and  Canonical  Variate  Analyses  to  improve  soil 
classification  in  eastern  Malaysia.     Morphological  and 
chemical  properties  determined  by  routine  analysis  were 
recorded  from  66  randomly  selected  sites.     The  first  nine 
PCs  accounted  for  more  than  70%  of  the  total  variance. 
Scatter  diagrams  of  pairs  of  components  were  drawn  to 
elucidate  the  population  structure.     Established  classes 
that were originally thought to be desirable overlapped
almost completely with respect to morphological and
chemical properties. Dendrograms derived from similarity
analysis  confirmed  the  interpretations  drawn  from  the 
scatter  diagrams. 

Williams  and  Rayner  (1977)  employed  PCA  as  a  method 
for  grouping  soils  based  on  chemical  composition  (Fe,  Ti, 
Ca,  K,  Si,  Al,  P,  Mg,  Mn,  Ni,  Cu,  Zn,  Ga,  As,  Br,  Rb,  Sr, 
Y,  Zr,  and  Pb  total  contents)  and  other  soil  properties 
such  as  particle  size  (sand,  silt,  and  clay),  loss  on 
ignition, CaCO3 content, pH, and soil moisture. The
scatter diagram showed that the first two components
divided the soils into parent material groups. This
grouping was also supported by using dendrograms derived
from  similarity  analysis.     It  was  concluded,  on  the  basis 
of  the  PCA,  that  the  soils  sampled  came  from  three  parent 
materials  of  different  ages. 

McBratney  and  Webster  (1981)  studied  the  relationships 
between  sampling  points  using  PCA.     A  substantial 
proportion  (44%)  of  the  total  variance  was  explained  by  the 
first  two  PCs.     The  first  component  represented  color. 
Varimax  rotation  was  employed  to  obtain  a  better 
interpretation  of  the  scatter  diagram  but  it  produced  no 
appreciable  improvement  in  interpretability.     The  scatter 
diagram of the PCs allowed the separation of sampled points into
five different groups.

Richardson and Bigler (1984) applied PCA to selected
soil  properties  (clay  content,  pH,  organic  carbon  content, 
CaCO3 equivalent, electrical conductivity, and soluble Mg,
Ca, and Na contents) which were meaningful to soil
development and plant growth in wetlands in North Dakota.
Four routine measurements useful for characterizing and
classifying wetland soils were identified by PCA
(electrical conductivity, organic carbon content, CaCO3
equivalent,  and  clay  content).     Electrical  conductivity  and 
soluble  Mg  and  Na  contents  were  the  most  important 
variables  in  explaining  observable  differences  in  wetland 
soils.     In  addition,  the  use  of  PCA  allowed  the  examination 
of  the  interaction  of  chemical  and  physical  properties  with 
the  landscape  position  of  wetland  soils,  as  well  as  the 
variation  in  properties  among  vegetation  zones,  after  the 
data  were  plotted  in  the  plane  of  the  first  two  PCs. 

Edmonds  et  al.   (1985)  employed  PCA  as  a  first  step  for 
using  Cluster  and  Discriminant  Analyses  to  study  taxonomic 
variation  within  three  soil  map  units.     Forty  different 
soil  properties  were  included  in  the  analyses.  Variables 
with  low  variance  were  excluded  by  the  analysis.     PCA  was 
used  to  reduce  the  number  of  dimensions  needed  to  ordinate 
pedons  in  the  plane  of  PCs  (character  space)  and  to  remove 
intercorrelation  of  soil  properties.     The  use  of  PC  scores 
as  data  for  Cluster  Analysis  avoided  distortions  in 
coordinates of the pedons in the plane of PCs. They
compared the results with the taxonomic classification of
soils, and concluded that grouping of pedons by numerical
taxonomy did not correspond to groupings by taxa in Soil
Taxonomy.

Geostatistics 

Webster  and  Burgess  (1983)  pointed  out  that  to 
describe  soil  variation  two  features  of  soil  must  be  taken 
into  account.     The  first  is  that  long  range  trends  have  no 
simple  mathematical  form;  usually,  there  is  not  any  obvious 
repeating  pattern;  and  the  larger  the  area  or  the  more 
intensive  the  sampling  the  more  complex  the  variation 
appears.     The  second  is  that  the  point-to-point  variation 
in  a  sample  reflects  real  soil  variation.     Only  a  small 
part  is  the  measurement  error.     In  addition,  the  same 
authors  indicated  that  earlier  attempts  to  describe  spatial 
variation  in  geology  and  geography  involved  fitting 
deterministic global equations to data, either exactly or
by least squares approximation. But the two features
mentioned above make the approach inappropriate for soil.
Thus, an alternative was to treat the soil as a random
function and to describe it using geostatistical techniques.

Historical Development

Etymologically,  the  term  geostatistics  designates  the 
statistical  study  of  natural  phenomena,  and  it  is  defined 
as the application of the formalism of random functions to
the reconnaissance and estimation of natural phenomena
(Journel  and  Huijbregts,  1978). 

Geostatistics  was  primarily  developed  for  the  mining 
industry (Matheron, 1963). It was very useful
to engineers and geologists for studying the spatial
distribution  of  important  properties  such  as  grade, 
thickness,  or  accumulation  of  mineral  deposits. 

Matheron  (1963)  considered  that,  historically, 
geostatistics  was  as  old  as  mining  itself.     He  indicated 
that  as  soon  as  mining  men  concerned  themselves  with 
foreseeing results of future work and, in particular, as
soon as they started to take and to analyze samples and
compute  mean  values  weighted  by  corresponding  thickness  of 
deposits  and  influence-zones,  geostatistics  was  born. 

Geostatistics  started  in  the  early  1950s  in  South 
Africa  with  D.G.  Krige  (Olea,  1975).     Krige  realized  that 
he  could  not  accurately  estimate  the  gold  content  of  mined 
blocks  without  considering  the  geometrical  setting 
(locations  and  sizes)  of  the  samples.  Matheron  expanded 
Krige's empirical observations into a theory of the
behavior  of  spatially  distributed  variables  which  was 
applicable  to  any  phenomenon  satisfying  certain  basic 
assumptions,  and  the  variables  were  not  limited  by  their 
physical nature.

Theoretical Bases

Classical  statistics  could  not  be  used  for  ore 
estimation  because  of  their  inability  to  take  into  account 
the  spatial  aspect  of  the  phenomenon  (Matheron,  1963).  An 
aleatory  variable  had  two  essential  properties:   (i)  the 
possibility,  theoretical  at  least,  of  repeating 
indefinitely  the  test  that  assigned  to  the  variable  a 
numerical  value,  and  (ii)  the  independence  of  each  test 
from  the  previous  and  the  next  tests.     A  given  ore-grade 
within a deposit would not have those two properties. The
content of a block of ore was, first of all, unique; in
addition, two neighboring ore samples were certainly
not independent.

Earth  scientists  usually  deal  with  complex  phenomena 
which  are  the  result  of  the  interaction  of  variables, 
through  relationships  which  are  in  part  unknown  and  in  part 
very  complex  (Olea,  1975).    Variations  are  erratic  and 
often  unpredictable  from  one  point  to  another,  but  there  is 
usually  an  underlying  trend  in  the  fluctuations  which 
precludes  regarding  the  data  as  resulting  from  a  completely 
random  process.     To  characterize  variables  which  are  partly 
stochastic  and  partly  deterministic  in  their  behavior, 
Matheron  (1971)  introduced  the  term  regionalized  variable. 
He  developed  the  regionalized  variable  theory  to  describe 
functions which vary in space with some continuity.

A regionalized variable is a continuously distributed
variable  having  a  geographic  variation  too  complex  to  be 
represented  by  a  workable  mathematical  function  (Campbell, 
1978).     Although  the  precise  nature  of  the  variation  of  a 
regionalized  variable  is  too  complex  for  a  complete 
description,  the  average  rate  of  change  over  distance  can 
be  estimated  by  the  semi-variance.     Conversely,  Olea  (1977) 
stated  that  a  regionalized  variable  is  a  function  that 
describes  a  natural  phenomenon  which  has  geographic 
distribution. 

The  term  geostatistics  has  come  to  mean  the 
specialized  body  of  statistical  techniques  developed  by 
Matheron  and  associates  to  treat  regionalized  variables 
(Olea,  1984).     The  theory  of  regionalized  variables  has  two 
branches:  the  transitive  methods  and  the  intrinsic  theory 
(Matheron,  1969).     The  first  is  a  highly  geometrical 
abstraction  without  probabilistic  hypothesis  and  has  little 
practical  interest.     The  practical  counterpart  of  those 
geometrical  abstractions  is  the  intrinsic  theory  which  is  a 
term  for  the  application  of  the  theory  of  random  variables 
to  regionalized  variables. 

Matheron  (1969)  and  Olea  (1975)  indicated  that 
regionalized  variables  are  characterized  by  the  following 
properties: (i) Localization: a regionalized variable is
numerically defined by a value associated with a
sample of specific size, shape, and orientation, which is
called the geometrical support. (ii) Continuity: the spatial
variation of a regionalized variable may be extremely large
or very small, depending on the phenomenon studied, but
an average continuity is generally present. In some cases
the average continuity cannot be confirmed, and then a
nugget effect is present. (iii) Anisotropy: changes may be
gradual in one direction and rapid or irregular in another.
These changes are known as zonalities.

A  basic  assumption  in  the  intrinsic  theory  is  that  a 
regionalized  variable  is  a  random  variate  (Matheron,  1969). 
The  observed  values  are  outcomes  following  some  probability 
density function. Henley (1981) considered a
regionalized variable to be a random function which may be
defined in terms of a probability distribution (e.g., it
may be normally distributed with a particular mean and
variance).

Olea  (1984)  indicated  that  a  spatial  function  can 
either  be  described  by  a  mathematical  model  or  given  by  a 
relative frequency analysis based on experimentation. The
former  approach  is  not  practical  because  of  the  complexity 
of  spatial  functions.     The  latter  is  seriously  limited  by 
the  maximum  number  of  samples  that  can  be  collected. 

Olea  (1975)  stated  that  the  difficulty  of  the  relative 
frequency approach with a regionalized variable is that a
repeated test cannot be run because each outcome is unique.
Since a large number of samples are essential to any
statistical  inference,  it  is  not  possible  to  determine  the 
probability  density  function  which  rules  the  occurrence  of 
a  regionalized  variable. 

The  impossibility  of  obtaining  the  probability  density 
function  associated  with  the  variable  is  not  a  serious 
limitation.     Most  of  the  properties  of  interest  depend  only 
on  the  structure  of  the  regionalized  variable  as  specified 
by  its  first  and  second  moments  (Olea,  1975).     A  key 
assumption  is  stationarity.     Stationarity  is  a  mathematical 
way  to  introduce  the  restriction  that  the  regionalized 
variable must be homogeneous. Stationarity permits
statistical  inference.     A  test  can  be  repeated  by  assuming 
stationarity  even  though  samples  must  be  collected  at 
different  points.     All  samples  are  assumed  to  be  drawn  from 
populations  having  the  same  moments. 

Several  scientists  have  discussed  the  assumption  of 
stationarity  (Henley,  1981;  Huijbregts,  1975;  Journel  and 
Huijbregts,  1978;  Olea,  1975;  1984;  Tipper,  1979;  Trangmar 
et  al.,  1985;  Webster,  1985).     Geostatistics  invokes  a 
stationary  constraint  called  the  intrinsic  hypothesis  to 
resolve  the  impossibility  of  obtaining  a  probability 
distribution.     A  regionalized  variable  is  called  strictly 
stationary  if  it  is  stationary  for  any  order  k  =  1,  2,  3, 
4, . . . n. If k is equal to one, the regionalized
variable has first-order stationarity. Second-order
stationarity also implies first-order stationarity.
Second-order  stationarity  signifies  that  the  first  two 
moments  (covariance  and  variance)  of  the  difference  between 
two  observations  are  independent  of  the  location  and  are  a 
function  only  of  the  distance  between  them. 

In general, for a regionalized variable stationary of order k,
all the moments of order k or less are invariant under
translation.     For  a  stationary  variable,  the  covariance  has 
the  following  properties: 

(i) $\mathrm{COV}(0) \geq |\mathrm{COV}(X_2 - X_1)|$   (2)

where COV = covariance

(ii) $\lim_{h \to \infty} \mathrm{COV}(h) = 0$   (3)

where lim = limit

(iii) $\mathrm{COV}(0) = \mathrm{VAR}[Y(X)]$   (4)

where VAR = variance

(iv) $\mathrm{COV}(X_2 - X_1) = \mathrm{COV}(X_1 - X_2)$   (5)

These relations are better visualized in Figure 1.

For  second-order  stationarity,  VAR[Y(X)]  must  be 
finite.     Then,  according  to  equation  (4)  COV(0)  must  be 
finite.     However,  many  phenomena  in  nature  are  subject  to 
unlimited  dispersion  and  cannot  correctly  be  described  when 
they  are  assigned  a  finite  variance.     Thus,  to  avoid  this 
restriction,  the  intrinsic  theory  assumes  what  is  called 
the  intrinsic  hypothesis.     The  intrinsic  hypothesis  is 
satisfied if, for any displacement h, the first two
moments of the difference [Z(x) - Z(x + h)] are independent
of the location x and are a function only of h:

$E[Z(x) - Z(x + h)] = M(h)$   (First moment)   (6)

$E[\{Z(x) - Z(x + h) - M(h)\}^2] = 2G(h)$   (Second moment)   (7)

where M(h) and G(h) are referred to as the drift and the
semi-variance or intrinsic function, respectively. The
semi-variance is a measure of the similarity, on the average,
between observations at a given distance apart. The more
alike the observations, the smaller is the semi-variance.
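As an illustration of equation (7) in the drift-free case, M(h) = 0, the semi-variance at a given lag can be estimated from a transect as half the mean squared difference between observations separated by that lag. The sketch below assumes regularly spaced observations along a single transect. The function name `empirical_semivariance` and the sample values are hypothetical and are not taken from the data of this study.

```python
def empirical_semivariance(values, max_lag):
    """Return {lag: semi-variance} for a regularly spaced transect.

    Assumes no drift, so G(h) = 0.5 * mean[(z(x) - z(x + h))^2].
    Lags are expressed in units of the sampling interval.
    """
    result = {}
    for lag in range(1, max_lag + 1):
        # All pairs of observations separated by exactly `lag` intervals.
        diffs = [values[i] - values[i + lag] for i in range(len(values) - lag)]
        if diffs:
            result[lag] = 0.5 * sum(d * d for d in diffs) / len(diffs)
    return result


# Hypothetical clay contents (%) at equally spaced points on a transect.
clay = [12.0, 13.0, 15.0, 14.0, 18.0, 21.0, 20.0, 24.0]
for lag, g in empirical_semivariance(clay, 3).items():
    print(f"lag {lag}: G = {g:.3f}")
```

Plotting the returned values against lag gives the experimental semi-variogram. Fewer pairs are available at larger lags, so those estimates are less reliable.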

The  semi-variogram  (Olea,  1975;  Journel  and 
Huijbregts,  1978),  which  is  the  plot  of  semi-variance 
against  distance  h  (lag),  has  all  the  structural 
information  needed  about  a  regionalized  variable:   (i)  zone 
of  influence  that  provides  a  precise  meaning  to  the  notion 
of  dependence  between  samples,   (ii)  anisotropy  when 
variability  is  direction-dependent  revealing  the  different 
behavior  of  the  semi-variogram  for  different  directions, 
and  (iii)  continuity  of  the  variable  through  space,  which 
is  indicated  by  the  shape  and  the  particular 
characteristics  of  the  semi-variogram  near  the  origin. 

One  of  the  oldest  methods  of  estimating  space  or  time 
dependency  between  neighboring  observations  is  through 
autocorrelation (Vieira et al., 1983). Nash (1985) pointed
out that the correlogram (plot of autocorrelation against
distance) is the mirror image of the semi-variogram.
Vieira et al. (1983) indicated that when interpolation
between measurements is needed, the semi-variogram is a
more adequate tool to measure the correlation between
measurements. An infinite dispersion is allowed using
semi-variances.

According to Journel and Huijbregts (1978) the
autocorrelation is equal to

$f(h) = C(h) / C(0)$ (8)

where f(h) = autocorrelation
C(h) = autocovariance or covariance at distance h
C(0) = variance


The relationship between C(h) and C(0) is expressed by
equation (4). When the semi-variance changes, it is
assumed that its variations are small with respect to the
working scale. This is a condition of quasi or local
stationarity. When the regionalized variable is weakly
stationary, it also obeys the intrinsic hypothesis. The
semi-variance is then given by


$G(h) = \sigma^2 - C(h)$ (9)

where G(h) = semi-variance
$\sigma^2$ = population variance
C(h) = autocovariance


The  autocorrelation  and  the  semi-variance  are  related 
by  the  following  equation: 

f(h)  =  1  -  G(h)/  C(0)  (10) 
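
Equation (10) follows directly from equations (4), (8), and (9). As a brief check, assuming second-order stationarity so that the population variance of equation (9) equals C(0):

```latex
\begin{align*}
G(h) &= \sigma^2 - C(h) = C(0) - C(h) && \text{by (9) and (4)} \\
\frac{G(h)}{C(0)} &= 1 - \frac{C(h)}{C(0)} = 1 - f(h) && \text{by (8)} \\
f(h) &= 1 - \frac{G(h)}{C(0)} && \text{which is (10)}
\end{align*}
```

The step from the first to the second line requires a finite C(0), which is why the relation is not available when only the intrinsic hypothesis holds.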

Burgess  and  Webster  (1980a)  pointed  out  that  the 
autocorrelation  coefficient  depends  on  the  variance 
(equation  8),  and  according  to  equation  (4)  the  variance 
must be finite to fulfill the requirement of stationarity.
It  was  indicated  earlier  that  many  phenomena  in  nature  are 
subject  to  unlimited  dispersion.     The  semi-variance  is  free 
of  this  restriction,  and  consequently  is  preferred.  They 
also  indicated  that  a  second  advantage  of  working  with 
semi-variance  is  that  it  is  easier  to  take  into  account 
local  trends  in  the  property  of  interest.     Residuals  are 
used  when  trends  are  present.     Webster  and  Burgess  (1980) 
demonstrated  that  the  variance  of  the  residuals  from  the 
mean  is  not  equal  to  the  variance  of  the  difference  between 
the  values  when  trends  are  present.  Therefore, 
autocorrelation  is  difficult  to  use. 

Webster (1985) classified the semi-variograms into
four groups:

Safe models. They are defined for one dimension but
are safe in the sense that they are conditional positive
definite in two and three dimensions. These models are

1. The linear model:

$G(h) = c_0 + wh$ for $h > 0$ (11)

$G(0) = 0$ (12)

where G = semi-variance
$c_0$ = intercept or nugget variance
w = slope
h = lag distance

Equation (11) assumes that h has an exponent a = 1. When
the exponent a = 0.5 the model is called root. When a = 2
the model is parabolic.

2. The spherical model:

$G(h) = c_0 + w\,[1.5\,(h/a) - 0.5\,(h/a)^3]$ for $0 < h < a$ (13)

$G(h) = c_0 + w$ for $h \ge a$ (14)

$G(0) = 0$ (15)

where a = range
$c_0 + w$ = sill

3. The exponential model:

$G(h) = c_0 + w\,[1 - \exp(-h/a)]$ for $h > 0$ (16)

$G(0) = 0$ (17)

4. The DeWijsian model:

$G(h) = c_0 + a \ln(h)$ for $h > 0$ (18)

$G(0) = 0$ (19)

5. The Gaussian model:

$G(h) = c_0 + w\,[1 - \exp(-(h/a)^2)]$ for $h > 0$ (20)

$G(0) = 0$ (21)

6. The hyperbolic model:

$G(h) = h / (\alpha + \beta h)$ (22)

where $\alpha$ and $\beta$ are coefficients of the hyperbola
function.

Risky models. The semi-variogram increases to a sill.

1. The circular model:

$G(h) = c_0 + w\,[1 - (2/\pi)\cos^{-1}(h/a) + (2h/(\pi a))\,(1 - h^2/a^2)^{1/2}]$ for $0 < h < a$ (23)

$G(h) = c_0 + w$ for $h \ge a$ (24)

$G(0) = 0$ (25)

2. The linear model with a sill:

$G(h) = c_0 + w\,(h/a)$ for $0 < h < a$ (26)

$G(h) = c_0 + w$ for $h \ge a$ (27)

$G(0) = 0$ (28)

Nested model. The components of variance measure the
amount of variance contributed by each scale.

$G(h) = \tfrac{1}{2}\,\mathrm{VAR}[Z(x) - Z(x + h)] = G_0(h) + G_1(h)$ (29)

where $G_0(h)$ = pure nugget semi-variance
$G_1(h)$ = spatially dependent semi-variance

Anisotropic model. Variability is not equal in all
lateral directions.

$G(h, \theta) = c_0 + u(\theta)\,|h|$ (30)

where

$u(\theta) = [A^2 \cos^2(\theta - \alpha) + B^2 \sin^2(\theta - \alpha)]^{1/2}$ (31)

where $\theta$ = anisotropy angle
$\alpha$ = direction of maximum variation
A = gradient of semi-variogram in direction of maximum variation
B = gradient in the direction $\alpha + \tfrac{1}{2}\pi$

The most common semi-variograms are shown in Figure 2.
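
To make the behavior of the bounded and unbounded models concrete, the following sketch evaluates four of the safe models at a few lags. It is an illustration only: the function names and parameter values are hypothetical, the Gaussian model is written with (h/a) squared, the usual form of that model, and the spherical model is taken to equal the sill at h = a.

```python
import math


def linear(h, c0, w):
    """Linear model, equations (11) and (12)."""
    return 0.0 if h == 0 else c0 + w * h


def spherical(h, c0, w, a):
    """Spherical model, equations (13) to (15); a is the range."""
    if h == 0:
        return 0.0
    if h >= a:
        return c0 + w  # sill
    return c0 + w * (1.5 * (h / a) - 0.5 * (h / a) ** 3)


def exponential(h, c0, w, a):
    """Exponential model, equations (16) and (17)."""
    return 0.0 if h == 0 else c0 + w * (1.0 - math.exp(-h / a))


def gaussian(h, c0, w, a):
    """Gaussian model, equations (20) and (21), with exponent 2 assumed."""
    return 0.0 if h == 0 else c0 + w * (1.0 - math.exp(-((h / a) ** 2)))


# Hypothetical parameters: nugget 1, spatial component 4, distance parameter 50 m.
for h in (0, 10, 25, 50, 100):
    print(h,
          round(spherical(h, 1.0, 4.0, 50.0), 3),
          round(exponential(h, 1.0, 4.0, 50.0), 3),
          round(gaussian(h, 1.0, 4.0, 50.0), 3))
```

Only the spherical model reaches its sill exactly at h = a. The exponential and Gaussian models approach the sill asymptotically, so for them a is a distance parameter rather than a strict range.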

Computing  a  series  of  semi-variograms  and  deriving  a 
model  from  them  is  usually  not  an  end  in  itself.  The 
objectives  of  geostatistical  studies  are  to  determine  the 
characteristics  of  the  data  and  to  obtain  the  best 
estimates  possible  with  the  available  data.     The  advantage 
of  using  a  geostatistical  approach  is  that  the  computed 
values  are  optimum.     The  error  of  estimation  is  minimized. 
The acronym BLUE (best linear unbiased estimation) is
sometimes  used  to  characterize  this  method  (Green,  1985). 

Estimation  procedures  that  incorporate  regionalized 
variable  theory  were  originally  known  as  kriging,  a  term 
named for D.G. Krige (DeGraffenreid, 1982). Kriging is a
distance-weighted  moving  average  estimation  procedure  that 
uses  the  semi-variogram  to  determine  optimal  weights. 

Kriging depends on computing an accurate semi-variogram.
Estimates of semi-variance taken from it are used to obtain
the weights applied to the data when computing the averages,
and they enter the kriging equation (Burgess and Webster,
1980a).

Figure 2. Common semi-variogram models (Gaussian; linear, root, parabolic).

When values of soil properties are averaged over point
values, which represent volumes with the same size and
shapes as the volumes of soil on which the original
descriptions were recorded (i.e., pedons), the kriging
procedure  is  called  punctual  kriging  (Burgess  and  Webster, 
1980a).    When  an  average  is  made  over  areas,  the  procedure 
is  called  block  kriging  (Burgess  and  Webster,  1980b). 
Block  kriging  produces  smaller  estimation  variances  and 
smoother  maps. 

Burgess  and  Webster  (1980a)  and  Webster  and  Burgess 
(1983)  pointed  out  that  kriging  is  a  means  of  spatial 
prediction  that  can  be  used  for  soil  properties.  In 
kriging,  the  weights  take  account  of  the  known  spatial 
dependence  expressed  in  the  semi-variogram  and  the 
geometric  relationships  among  the  observed  points.  Kriging 
is  optimal  in  the  sense  that  it  provides  estimates  of 
values  at  unrecorded  places  without  bias  and  with  minimum 
known  variance. 

It  has  been  indicated  by  several  scientists 
(Huijbregts,  1975;  Olea,  1975;  1984;  Trangmar  et  al.,  1985; 
Webster  and  Burgess,  1980)  that  kriging  is  used  only  with 
regionalized  variables  that  are  first-order  stationary.  For 
variables  whose  drift  is  not  stationary,  but  for  whose 
residuals  the  intrinsic  hypothesis  holds,  universal  kriging 
is used.

Webster and Burgess (1980) stated that universal
kriging  takes  account  of  local  trends  in  data  when 
minimizing  the  error  associated  with  estimation.  Universal 
kriging  can  be  performed  after  computing  suitable 
expressions  for  the  drift  and  corresponding  semi-variograms 
of  the  residuals. 

Olea (1984) said that universal kriging is a linear
estimator of the regionalized variable and has the form

$Z^*(x_0) = \sum_{i=1}^{n} \Gamma_i\, Z(x_i)$ (32)

where $Z^*(x_0)$ = unknown parameter at location $x_0$
$\Gamma_i$ = weights
$Z(x_i)$ = value of a property at a point $x_i$

Matheron (1963) stated that suitable weights $\Gamma_i$
assigned to each sample are determined by two conditions.
The first condition is that $Z^*(x_0)$ and $Z(x_i)$ must have
the same average value within the area of influence, and is
written as

$\sum_{i=1}^{n} \Gamma_i = 1$ (33)

The second condition is that the $\Gamma_i$ have such values that
the estimation variance (kriging variance) of $Z^*(x_0)$ and
$Z(x_i)$ takes the smallest possible value. The unknown
$\Gamma_i$ are found by solving a system of linear equations
which results from forcing the unbiased estimator to have
minimum variance. The equation is as follows:

$AX = B$ (34)

where A, B, and X are given by equations (35), (36), and
(37) (Figures 3 and 4).
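
The structure of equation (34) can be illustrated with the simpler case of ordinary kriging, in which the drift is a constant and the only side condition is that the weights sum to one. The sketch below is not the universal kriging system of equations (35) to (37): the matrices it builds, the function names, the spherical parameters, and the sample values are all assumptions made for illustration.

```python
import math


def spherical(h, c0=1.0, w=4.0, a=50.0):
    """Spherical semi-variogram with hypothetical parameters."""
    if h == 0:
        return 0.0
    return c0 + w if h >= a else c0 + w * (1.5 * h / a - 0.5 * (h / a) ** 3)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def ordinary_krige(points, values, target, model=spherical):
    """Return (estimate, kriging variance, weights) at `target`."""
    n = len(points)
    # Matrix A: semi-variances between samples, bordered by the
    # unbiasedness constraint (weights sum to one).
    A = [[model(math.dist(p, q)) for q in points] + [1.0] for p in points]
    A.append([1.0] * n + [0.0])
    # Vector B: semi-variances between each sample and the target.
    B = [model(math.dist(p, target)) for p in points] + [1.0]
    X = solve(A, B)  # n weights followed by the Lagrange multiplier
    weights, mu = X[:n], X[n]
    estimate = sum(wt * z for wt, z in zip(weights, values))
    variance = sum(wt * g for wt, g in zip(weights, B[:n])) + mu
    return estimate, variance, weights


# Hypothetical pH values at the corners of a 30 m square.
pts = [(0.0, 0.0), (30.0, 0.0), (0.0, 30.0), (30.0, 30.0)]
ph = [5.2, 5.8, 5.5, 6.1]
est, var, wts = ordinary_krige(pts, ph, (15.0, 15.0))
print(round(est, 3), round(var, 3), [round(x, 3) for x in wts])
```

At the center of the square the four weights are equal by symmetry, and at a sampled location the estimate reproduces the observed value with zero kriging variance. Universal kriging adds further rows and columns for the drift functions to the same bordered system.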

In  recent  years  a  new  method  for  estimation  has  been 
developed.    Vieira  et  al.   (1983)  stated  that  in  soil 
science,  agrometeorology,  and  remote  sensing,  very  often 
some  variables  are  cross-related  with  others.     In  addition, 
some  of  those  variables  are  easier  to  measure  than  others. 
In  such  situations  estimation  of  one  variable  using 
information  about  both  itself  and  another  cross-correlated, 
easier-to-measure variable should be more useful than
the  kriging  of  that  variable  by  itself.     This  estimation  is 
easily  made  using  cokriging. 

Cokriging  has  been  defined  as  the  estimation  of  one 
spatially  distributed  variable  from  values  of  another 
related variable (DeGraffenreid, 1982; Gutjahr, 1984).

Dependence  between  two  variables  can  be  expressed  by  a 
cross semi-variogram (McBratney and Webster, 1983a). For
any pair of variables i and j there is a cross
semi-variance $G_{ij}(h)$ at lag h, defined as

$G_{ij}(h) = \tfrac{1}{2}\, E[\{Z_i(x) - Z_i(x + h)\}\{Z_j(x) - Z_j(x + h)\}]$ (38)

where $Z_i$ and $Z_j$ are the values of i and j at places x and
x+h. If i = j, then equation (38) represents the auto
semi-variance.
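
As a small illustration of equation (38), the cross semi-variance can be estimated from two variables measured at the same points of a regularly spaced transect. The sketch includes a factor of one half so that the case i = j reduces to the semi-variance of equation (7); that factor should be treated as an assumption if equation (38) is read without it. The function name and the data are hypothetical.

```python
def cross_semivariance(z_i, z_j, lag):
    """Empirical cross semi-variance of two co-located transect variables.

    Uses 0.5 * mean[(z_i(x) - z_i(x+h)) * (z_j(x) - z_j(x+h))],
    with the lag given in units of the sampling interval.
    """
    if len(z_i) != len(z_j):
        raise ValueError("both variables must be observed at the same sites")
    n_pairs = len(z_i) - lag
    if n_pairs <= 0:
        raise ValueError("lag is too large for the number of observations")
    total = sum((z_i[k] - z_i[k + lag]) * (z_j[k] - z_j[k + lag])
                for k in range(n_pairs))
    return 0.5 * total / n_pairs


# Hypothetical sand content (%) and available water (%) along a transect.
sand = [62.0, 60.0, 55.0, 57.0, 50.0, 48.0]
water = [9.0, 9.5, 11.0, 10.5, 12.5, 13.0]
for lag in (1, 2, 3):
    print(lag, round(cross_semivariance(sand, water, lag), 3))
```

Unlike the semi-variance, the cross semi-variance can be negative, as it is here, when an increase in one variable tends to accompany a decrease in the other.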
$G(x_j, x_k)$ = semi-variance between two sample elements
located at a distance $(x_j - x_k)$.

$f_l$ = function of x, derived from the drift.

Figure 3. Equation number 35.

$\Gamma_j$ = weights.

$G(x_k, x_0)$ = semi-variance between two sample elements
located at a distance $(x_k - x_0)$.

$f_l(x)$ = function of x, derived from the drift.

$\mu_l$ = Lagrange multipliers.

Figure 4. Equation number 36 (a) and equation
number 37 (b).
The cokriging equation is given by

$Z^*(x_j) = \sum_{j} \sum_{i=1}^{n_j} \Gamma_{ij}\, Z(x_{ij})$, for all j (39)

where i, j = variables
$Z^*(x_j)$ = estimated value of variable j at location $x_0$
$\Gamma_{ij}$ = weights

To avoid bias the weights have to fulfill two conditions:

(i) $\sum_{i=1}^{n_j} \Gamma_{ij} = 1$ (40)

and

(ii) $\sum_{i=1}^{n_j} \Gamma_{ij} = 0$ for all i not equal to j. (41)

The  first  condition,  according  to  McBratney  and 
Webster  (1983a),  implies  that  there  must  be  at  least  one 
observation  of  the  variable  j  for  cokriging  to  be  possible, 
and  as  in  kriging  equation  (34),  cokriging  can  be  expressed 
in  matrix  notation  for  solving  the  unknown  weights. 

Trangmar  et  al.   (1985)  indicated  that  cokriging 
requires  at  least  one  sample  point  of  both  the  primary 
variable  and  covariable  properties  within  the  estimation 
neighborhood.     If  the  primary  variable  and  covariable  are 
present  at  all  sampling  sites  in  the  neighborhood,  then 
cokriging  is  considered  as  an  auto-kriging  of  the  primary 
variable alone. In such cases, cokriging is unnecessary.

Practical Use

Earlier  studies  in  soil  science  used  time  series 
analysis  in  which  spatial  dependence  of  soil  properties  was 
considered.     Webster  and  Cuanalo  (1975)  computed 
correlograms for clay, silt, pH, CaCO3, color-value, and
stoniness for three horizons in pedons located at 10 m
intervals along a transect in north Oxfordshire, England.
They  observed  that  the  relation  between  sampling  points 
weakened  steadily  over  distances  from  10  m  to  about  230  m. 
The  average  spacing  between  geological  boundaries  on  the 
transect  was  also  about  230  m.     Outcrop  bedrock  was 
inferred  as  one  of  the  main  sources  of  soil  variation. 
They  concluded  that  mappable  soil  boundaries  were  likely  to 
occur  on  the  average  every  230  m,  and  sampling  at  spacing 
closer  than  115  m  would  be  needed  to  detect  them. 

Lanyon  and  Hall  (1980)  used  morphological,  physical, 
and  chemical  soil  properties  to  test  the  performance  and 
value  of  auto-correlation  analysis.     Spatial  dependence  was 
determined  from  observations  made  every  20  m  along  a 
transect  for  solum  thickness;  fine-earth  fraction  of  the  A, 
B,  and  C  horizons;  and  for  soil  pH,  percent  base  saturation 
(PBS),  and  exchangeable  cations  from  the  deepest  horizon. 
They  found  that  the  range  varied  from  20  m  for  solum 
thickness  and  exchangeable  K  content  to  60  m  for  pH  and 
exchangeable Mg content. They concluded that
auto-correlation analysis emphasized the continuous, orderly
nature of soils, and the fact that spatially related
observations may be mutually dependent.

Campbell  (1978)  was  one  of  the  first  to  use 
geostatistics  in  soil  science.     He  studied  the  spatial 
variation  of  sand  and  pH  measurements  employing  the  semi- 
variance.     Samples  were  collected  at  10  m  intervals  on  two 
sampling  grids  positioned  on  two  contiguous  delineations  in 
eastern  Kansas.     There  was  a  contrast  in  spatial  variation 
of  sand  content  within  the  two  delineations.     Distances  of 
30  and  40  m  were  sufficient  to  encounter  full  variation  of 
sand  content.     Soil  pH  had  a  random  variation  within  both 
areas.     It  was  concluded  that  the  most  important 
application  of  semi-variograms  was  in  determining  optimum 
sample  spacing  in  the  design  of  efficient  sampling 
strategies. 

Gambolati  and  Volpi  (1979)  introduced  the 
determination  of  the  trend  a  priori,  and  improved  the 
process  of  fitting  the  observed  to  a  theoretical  semi- 
variogram.     They  used  kriging  to  describe  ground-water  flow 
near  Venice,  Italy.     They  proposed  and  used  a  modification 
of  the  kriging  technique  developed  by  Matheron  (1970)  which 
aimed  at  improving  the  accuracy  of  the  interpolation 
procedure. In Matheron's (1970) basic theory, the trend
was not assessed a priori. The trend was considered as a
linear combination of functions with unknown coefficients.
Gambolati and Volpi (1979) considered the trend a priori;
therefore the trend had to be determined. Also, they
defined  the  concept  of  theoretical  consistency  in  kriging 
applications.     Theoretical  consistency  was  derived  from  the 
validation  of  the  interpretation  model.     Validation  was 
made  by  suppressing  each  observation  point  one  at  a  time, 
by  providing  an  estimate  in  that  point  using  the  remaining 
(n-1)  observations,  and  analyzing  the  distribution  of 
errors.     They  stated  that  consistency  occurred  when  there 
was  no  systematic  error  (kriged  average  error  was 
approximately  zero)  and  the  standard  deviation  was 
consistent  with  the  corresponding  error  (the  average  ratio 
of  theoretical  to  calculated  variance  was  approximately 
equal  to  one).     They  found  that  validation  of  the 
interpretation  models  selected  for  study  showed  that  their 
approach  yielded  accurate  results,  provided  the  trend  was 
correctly  assessed. 
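
The validation procedure described above, in which each observation is suppressed in turn and re-estimated from the remaining n-1 points, can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Gambolati and Volpi (1979): the function names are hypothetical, the consistency ratio is computed as squared error over predicted variance, and the example estimator is ordinary kriging under a pure nugget semi-variogram, for which the weights and the kriging variance have a simple closed form.

```python
def leave_one_out(points, values, estimator):
    """Suppress each observation in turn and re-estimate it from the rest.

    `estimator(points, values, target)` must return (estimate, variance).
    Returns (mean error, mean ratio of squared error to predicted variance).
    """
    errors, ratios = [], []
    for k in range(len(points)):
        rest_pts = points[:k] + points[k + 1:]
        rest_val = values[:k] + values[k + 1:]
        est, var = estimator(rest_pts, rest_val, points[k])
        err = est - values[k]
        errors.append(err)
        ratios.append(err * err / var)
    n = len(errors)
    return sum(errors) / n, sum(ratios) / n


def nugget_krige(points, values, target, c0=2.0):
    """Ordinary kriging under a pure nugget semi-variogram G(h) = c0.

    In this special case every weight equals 1/m and the kriging
    variance is c0 * (1 + 1/m), where m is the number of samples used.
    The nugget value c0 is a hypothetical choice.
    """
    m = len(values)
    return sum(values) / m, c0 * (1.0 + 1.0 / m)


# Hypothetical observations at four points on a line.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
obs = [1.0, 2.0, 3.0, 4.0]
mean_error, mean_ratio = leave_one_out(pts, obs, nugget_krige)
print(round(mean_error, 4), round(mean_ratio, 4))
```

A mean error near zero indicates no systematic error, and a mean ratio near one indicates that the predicted variances are of the same size as the errors actually made. Any estimator with the same call signature can be passed in place of `nugget_krige`.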

Chirlin  and  Dagan  (1980)  modeled  water  flow  through 
two-dimensional  porous  formations  as  a  random  process  using 
an  approximate  formulation  of  flow  physics  to  obtain  an 
expression for the head variogram. The head variogram
proved markedly anisotropic, with heads differing more
widely on average for a fixed lag parallel to the head
gradient than perpendicular to it. They also examined a
hypothetical case ignoring anisotropy. It was determined
from their experiment that the kriged standard deviation
was overestimated perpendicular to the mean flow and was
underestimated parallel to it.

Hajrasuliha  et  al.   (1980)  studied  salinity  levels  in 
three  different  fields  in  southwest  Iran  which  were 
initially  sampled  on  an  arbitrarily  selected  grid  of  80  m. 
Semi-variances  were  calculated  for  all  three  sites  to 
determine  the  degree  of  dependence  between  observations. 
The  results  from  two  fields  showed  that  observations  were 
spatially  dependent.     Contour  lines  of  iso-salinity  were 
obtained  by  using  kriging.     In  the  third  field  salinity 
observations  were  found  to  be  spatially  independent.  Thus, 
the  number  of  samples  necessary  to  get  fiducial  limits  and 
to  identify  the  number  of  samples  to  be  taken  randomly 
across  the  field  for  a  given  probability  were  obtained  by 
using  classical  statistical  methods. 

Luxmoore  et  al.   (1981)  used  semi-variograms  to 
characterize  spatial  variability  of  infiltration  rates  into 
a  weathered  shale  subsoil.     Infiltration  rates  were 
measured using double-ring infiltrometers installed at 48
locations  on  a  2  x  2  m  grid  after  the  removal  of  1  to  2  m 
of  soil.     A  high  degree  of  variability  in  infiltration 
rates  was  determined.     The  test  for  spatial  patterning 
using  the  semi-variogram  approach  proved  negative. 
Therefore,  they  concluded  that  if  patterning  existed  at 
all, it occurred on a spatial scale less than the 2 m used
in the study. As a result of this study, it was determined
that  infiltration  rate  was  a  randomly  distributed  property. 

Vieira  et  al.   (1981)  analyzed  the  spatial  variability 
of  1280  field-measured  infiltration  rates  on  Typic 
Xerorthents.     The  measurements  were  made  at  the  nodes  of  an 
irregular  grid.     The  semi-variogram  showed  a  range  of  50  m. 
It  was  considered  that,  on  the  average,  samples  separated 
by  50  m  or  more  were  not  correlated  to  each  other. 
In addition, they examined the effect of the neighborhood
size  on  the  value  kriged  and  its  estimation  variance.  They 
determined  that  a  neighborhood  of  14  m  was  sufficient  for 
the  infiltration  data.     The  estimation  variances  changed 
very  little  for  larger  distances.     Low  mean  estimation 
error,  low  variance,  and  high  correlation  coefficient 
showed  that  the  kriging  estimation  was  exceptionally  good. 
Finally,  it  was  determined  that  geostatistics  was  useful  to 
redesign  the  sampling  scheme.     The  large  number  of  measured 
values  made  it  possible  to  calculate  the  minimum  number  of 
samples  necessary  to  reproduce  the  infiltration  rate 
measurements  with  good  precision.     It  was  determined  that 
128  samples  were  enough  to  obtain  nearly  the  same 
information  as  with  1280  samples. 

Geostatistics was used for the first time to study soil
variability of large areas in Kigali, Rwanda by Vander Zaag
et al. (1981). They studied the spatial variability of
selected soil properties (pH; exchangeable Ca, Mg, K, and
Na contents; KCl-extractable Al content; percent
Al-saturation; effective CEC; µg P-sorbed at an equilibrium P
concentration of 0.02 and 0.2 µg/g; extractable P content;
P and Si in the saturation extract; total N, NO3, and NH4;
and extractable S contents) in the whole country of Rwanda.
Semi-variograms  of  soil  pH,  exchangeable  Ca  content, 
effective  CEC,  Si  in  the  saturation  extract,  and 
extractable  NH4  content  showed  long  range  spatial 
dependence.     The  spatial  dependence  varied  from  37.5  km  for 
soil  pH  to  more  than  60  km  for  extractable  NH4.  The 
information  contained  in  the  semi-variogram  was  used  to 
estimate  values  of  soil  properties  at  unsampled  locations 
within  the  range  of  the  semi-variogram.     Maps  of  estimation 
variance  of  kriged  values  were  also  generated.     Such  maps 
showed  that  estimation  variance  of  kriged  values  generally 
increased  with  increasing  distance  from  sample  points.  It 
was  indicated  that  geostatistics  could  be  used  to  make 
quick,  low  cost  assessments  of  soil  variability  of  large 
land  areas.     In  addition,  the  map  of  estimation  variance 
gave  an  indication  of  the  confidence  limits  of  the 
estimated  values.     The  map  can  be  used  to  locate  optimum 
sampling  sites  to  lower  the  estimation  variance. 

McBratney  and  Webster  (1981)  computed  semi-variograms 
of  subsoil  properties  (depth  to  subsoil,  soil  color, 
particle-size, mineralogy, organic carbon content, total N
content, ratio OC/total N, and pH). Samples were taken on
a transect at 20 m intervals. Semi-variograms showed
spatial  dependence  extending  to  about  360  m  for  some 
properties,  in  particular  color  and  pH.     Other  subsoil 
properties  had  little  or  no  spatial  dependence,  notably 
particle-size  fractions  and  organic  carbon  content.  The 
shape  of  some  semi-variograms  indicated  presence  of 
different  map  units  on  the  transect. 

Van  Kuilenburg  et  al.   (1982)  applied  three 
interpolation techniques (proximal, weighted average, and
kriging) to point data involving soil moisture supply
capacity on a 2 x 2 km grid of cover sand in the eastern
part of the Netherlands. Survey points used for
interpolation were randomly stratified with an average
density of 1.5 per ha. The root mean squared error was
used as a measure of efficiency. The root mean squared
error was large for the proximal method (less efficient)
and there was a negligible difference between root mean
squared errors for weighted average and kriging. Weighted
average  had  the  weakness  that  possible  clusters  of  survey 
points  were  weighted  too  heavily.     This  was  avoided  in 
kriging.     Therefore,  kriging  proved  to  be  the  most 
efficient  for  the  survey  method  used. 
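
The comparison criterion used in that study, the root mean squared error at test points, can be sketched for the two simpler interpolators. The code is illustrative only: the weighted average is assumed here to use inverse-distance weights with a hypothetical power of 2, which may differ from the weighting used by Van Kuilenburg et al. (1982), and all names and data are invented for the example.

```python
import math


def proximal(points, values, target):
    """Proximal method: take the value of the nearest survey point."""
    nearest = min(range(len(points)),
                  key=lambda k: math.dist(points[k], target))
    return values[nearest]


def weighted_average(points, values, target, power=2.0):
    """Inverse-distance weighted average (weights 1 / d**power, assumed)."""
    num = den = 0.0
    for p, z in zip(points, values):
        d = math.dist(p, target)
        if d == 0.0:
            return z  # target coincides with a survey point
        wt = 1.0 / d ** power
        num += wt * z
        den += wt
    return num / den


def rmse(predicted, observed):
    """Root mean squared error between predictions and test observations."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Hypothetical survey points (m), moisture supply values (mm), and test points.
survey = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
supply = [110.0, 140.0, 120.0, 160.0]
tests = [(50.0, 50.0), (20.0, 10.0)]
truth = [130.0, 115.0]

for name, method in (("proximal", proximal), ("weighted average", weighted_average)):
    predictions = [method(survey, supply, t) for t in tests]
    print(name, round(rmse(predictions, truth), 2))
```

Neither method accounts for clustering of survey points, which is the weakness of the weighted average noted above and the reason kriging weights, which depend on the configuration of the points, were preferred.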

Yost  et  al.   (1982a)  collected  samples  from  80  sites  at 
1  to  2  km  intervals  in  Hawaii.     Soil  samples  were  taken 
from  0  to  15  cm  (topsoil)  and  30  to  45  cm  depths  (subsoil). 
The former depth represented the nutrient status as
influenced by management and the latter depth represented
the natural conditions. Semi-variograms for soil pH,
exchangeable cations (Ca, Mg, K, Na), sum of cations, P
requirements, Si and P in saturation extract, extractable P
content,  and  rainfall  were  calculated.     Ranges  were  much 
greater  for  soil  properties  in  the  0  to  15  cm  depth  than 
for  those  in  the  30  to  45  cm  depth.     Semi-variograms  for 
Ca,  Mg,  K,  and  P  contents  based  on  the  30  to  45  cm  depth 
samples  demonstrated  greater  variability  and  had  smaller 
ranges  (Ca,  Mg,  and  K)  than  those  based  on  the  0  to  15  cm, 
or  were  extremely  variable  (P).     Si  in  saturation  extract 
had  the  same  range  in  the  subsoil  as  in  the  topsoil. 
Subsoil  properties  were  highly  variable.     Thus,  soil 
management  and  rainfall  imposed  a  degree  of  uniformity  on 
the  surface  soil  properties  not  apparent  in  the  subsoil. 
Yost  et  al.   (1982a)  concluded  that  soil  chemical  properties 
had  spatial  dependence  and  that  understanding  such 
structure  may  provide  new  insights  into  soil  behavior  over 
the  landscape.     The  semi-variograms  changed  at  large 
distances.     These  changes  suggested  that  soils  should  be 
grouped  to  obtain  uniform  regions  of  soil  properties 
suitable  for  management  regimes. 

Yost  et  al.   (1982b)  used  soil  data  from  transects  in 
Hawaii  for  estimating  soil  P  sorption  over  the  entire 
island by using kriging. The necessity of considering
non-stationarity and the use of universal kriging were
evaluated. Universal kriging, either by prior polynomial
trend  removal  or  by  local  polynomial  trend  removal  during 
estimation,  was  not  beneficial  in  spite  of  widely  varying  P 
sorption  and  a  significant  polynomial  trend  in  the  data. 
The  kriged  estimates  indicated  that  P  sorption  properties 
of  soil  obtained  from  transects  could  be  estimated  in  an 
optimal  way  and  could  be  displayed  in  a  manner  to  better 
understand  the  soil  properties  and  genesis,  and  for 
practical  purposes,  estimating  the  fertilizer  needs  and 
distribution  facilities. 

McBratney  et  al.   (1982)  sampled  3500  sites  to  study 
the  spatial  variability  of  Cu  and  Co  soluble  in  mild 
extractants  measured  to  identify  places  where  these  metals 
were  deficient  for  animals.     Semi-variograms  for  both  Cu 
and  Co  were  isotropic  and  appeared  to  combine  three 
components  of  variation:  a  short  range  component  extending 
up  to  3  km,  a  long  range  or  geological  component  extending 
to  15  km,  and  a  non-spatial  or  nugget  component,  which 
accounted  for  32%  and  63%  of  the  total  variance  of  Cu  and 
Co,  respectively.     Cu  showed  a  greater  degree  of  spatial 
dependence  than  Co.     In  addition,  isarithmic  maps 
identified  areas  where  Cu  and/or  Co  were  deficient.  An 
error  map  showed  that  precision  was  generally  acceptable. 
Also,  the  map  identified  a  few  areas  in  the  region  in  which 
sampling was too sparse for confidence.

Byers and Stephens (1983) sampled an untilled
medium-grained fluvial sand in horizontal and vertical transects
to  study  the  spatial  structure  of  particle  size  and 
saturated  hydraulic  conductivity.     Semi-variogram  and 
kriging  analyses  indicated  that  both  hydraulic  conductivity 
and  particle  size  were  relatively  isotropic  in  the 
horizontal  plane  but  had  marked  anisotropy  in  the  vertical 
plane.     There  were  marked  similarities  in  spatial  structure 
in  the  horizontal  plane.     The  spatial  distribution  of 
saturated  hydraulic  conductivity  in  the  horizontal  plane 
was  estimated  reasonably  well  using  an  empirical 
relationship  between  particle  size  and  conductivity  along 
with  kriged  estimates  of  the  10%  finer  particle  size. 

Ten  Berge  et  al.   (1983)  studied  the  spatial 
distribution  of  selected  soil  properties  (moisture  content, 
moisture  tension,  bulk  density,  texture,  temperature,  and 
equivalent  surface  temperature).     Two  transects  were 
sampled  at  4  m  intervals.     Semi-variograms  for  moisture 
content  and  bulk  density  did  not  show  any  range  but  only  a 
nugget  effect.     For  other  soil  properties  semi-variograms 
had  a  range  varying  between  80  and  more  than  120  m  (texture 
and  temperature).     Gradual  changes  in  soil  characteristics 
were  expected.     The  presence  of  abrupt  map  unit  boundaries 
was  determined  for  some  properties  (e.g.,  texture).  The 
spatial  structure  of  the  field  moisture  content  was  found 
only at very shallow depths. Texture introduced
differences in hydraulic conductivity, which were thought
to  cause  differences  in  topsoil  moisture  content. 

Vauclin  et  al.   (1983)  used  geostatistics  for  studying 
the  variability  of  particle-size  data,  available  water 
content,  and  water  stored  at  1/3  bar.     The  soil  samples 
were  collected  within  a  70  x  40  m  area  at  the  nodes  of  a  10 
m  square  grid.     All  semi-variograms  had  a  nugget  effect 
which  corresponded  to  the  variability  that  occurred  within 
distances  shorter  than  the  sampling  interval  and  to 
experimental  uncertainties.     The  range  varied  from  26  m  for 
water  stored  at  1/3  bar  to  50  m  for  silt  content.  Cross 
semi-variograms  were  calculated  demonstrating  that 
available  water  content  at  1/3  bar  was  correlated  with  sand 
content  within  distances  of  43.5  and  30  m.  Semi-variograms 
and  cross  semi-variograms  were  used  to  krige  and  cokrige 
additional  values  of  available  water  content  and  water 
stored  at  1/3  bar  every  5  m.     They  indicated  that  the  use 
of  cokriging  was  a  promising  tool  whether  the  principal 
objective  was  the  reduction  of  the  estimated  variance 
compared  with  kriging  or  the  need  to  estimate  an  under- 
sampled  variable  by  taking  into  account  its  spatial 
correlation  with  another  more  sampled  variable. 

Spatial  variability  of  nitrates  in  cotton  petioles 
was  determined  employing  semi-variograms  and  kriging  (Tabor 
et  al.,  1984).     Sampling  of  petioles  was  of  two  types,  on 
transects and from randomly selected sites on a rectangular
grid. Nitrates in petioles showed definite spatial
dependence  in  the  field  studied.     However,  for  sampling 
areas  of  <  1  m,  spatial  dependence  was  insignificant 
compared  to  the  inherent  variability  of  the  sample  and 
laboratory  analyses.  Semi-variograms  and  kriged  maps  of 
nitrates  in  petioles  suggested  a  strong  influence  of  the 
cultural  practices  such  as  direction  of  rows  and 
irrigation. 

Bos et al. (1984) sampled a 50 x 200 m rectangular grid
at 10 m intervals on sand tailings capped with 0 to
> 2 m of strip-mine overburden. This was done to present
and discuss the use of semi-variograms to study the spatial
variation of extractable P, Na, K, Mg, and Ca contents,
extractable acidity, CEC, total P content, pH-water,
pH-KCl, and soluble salts of the topsoil (0 to 25 cm) and
relative elevation in reclaimed Florida phosphate mine
lands. Semi-variograms were calculated for data taken
along  transects  in  four  different  sampling  directions  and  a 
combined direction. Some properties (CEC and relative
elevation) showed no structure of spatial variation.
The range was approximately 6 m for the combined and E-W
semi-variograms. Also, a nugget effect was observed, which
represented variability at distances < 10 m. The presence of
anisotropy could not be established because well-defined
sills and ranges could not be determined for the N-S,
NE-SW, and NW-SE directions. The semi-variograms were
supported by too few data points at large distances. It was
concluded that semi-variograms were useful in determining
the spatial variability of soil properties on reclaimed
phosphate mine lands and in improving sampling design for
liming and fertilization needs.

Xu and Webster (1984) tested how geostatistical
techniques could be applied to large areas. The topsoil
of 102 pedons evenly distributed throughout the studied
area in China was sampled. Soil pH-water, organic matter,
sand,  total  N,  total  P,  and  total  K  contents  were  measured. 
Variation  of  soil  properties  appeared  to  be  isotropic. 
Soil  pH  showed  the  strongest  spatial  dependence. 
Isarithmic  mapping  of  local  estimates  of  pH  showed  zones  of 
alkaline soils. Because sampling was sparse, on average
one sample per 3.5 km², the estimation errors were large.
It was suggested that a more intensive sampling scheme
would increase confidence in the maps. This would also
improve the estimation of semi-variograms, especially for
lags in the range of 0.5 to 5 km.

Saddiq et al. (1985) collected data on soil water
tension  from  99  tensiometers  along  a  76  m  row  planted  with 
chile  pepper  and  irrigated  through  trickle  tubing  placed  5 
cm  below  the  soil  surface.     Semi-variograms  indicated  a 
large  variability  and  little  spatial  dependence  in  soil 
water  tension.     The  range  was  <  6  cm.     Also,  it  was 
determined that variability and spatial dependence were
functions of the method and timing of water application and
the  magnitude  of  the  soil  water  tension.    When  water  was 
applied  through  a  trickle  line,  variability  was  greatest 
and  spatial  dependence  was  smallest.    Variability  was  low 
and  spatial  dependence  high  after  rain  or  extensive 
flooding. 

Rogowski  et  al.   (1985)  were  probably  the  first  to  use 
geostatistics  to  estimate  erosion  at  different  scales. 
Erosion  was  measured  at  nodes  of  three  different  size 
grids:  225  measurements  from  a  15  x  22.5  km  grid,  25  from  a 
5  x  7.5  km  grid,  and  150  from  a  1  x  1.5  km  grid  in  west 
central Pennsylvania. Erosion at each node was computed
using the Universal Soil Loss Equation. Kriging was
employed to map potential erosion. It was determined that
the large grid sampling size smoothed out the variability
by assuming that a fixed slope length and gradient were
applicable to the entire area. It was concluded that
estimation  of  erosion  on  a  1  ha  basis  (small  grid)  would 
likely  lead  to  the  optimum  prediction  capability.  This 
conclusion  was  based  primarily  on  the  results  of  structural 
analysis  of  soil  loss  data  which  suggested  a  workable 
continuity  range  of  about  0.1  km  for  an  exponential  semi- 
variogram  model.     The  relative  dispersion  was  about  the 
same  for  the  smaller  and  the  larger  areas. 
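
For reference, an exponential semi-variogram model of the kind mentioned above can be sketched as follows. The parameter names are illustrative, and the nugget and sill values in the example are arbitrary. The source does not state whether the range of about 0.1 km refers to the distance parameter of the model or to its practical range, so the sketch exposes both; treating 0.1 km as the distance parameter in the example is an assumption.

```python
import math

def exponential_semivariogram(h, nugget, partial_sill, range_param):
    """gamma(h) = nugget + partial_sill * (1 - exp(-h / range_param)) for h > 0."""
    if h == 0:
        return 0.0  # by definition gamma(0) = 0
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))

def practical_range(range_param):
    """Distance at which the exponential model reaches about 95% of its sill."""
    return 3.0 * range_param

# Illustrative values only: distance parameter of 0.1 km, no nugget, unit sill.
a_km = 0.1
gamma_at_practical_range = exponential_semivariogram(practical_range(a_km), 0.0, 1.0, a_km)
```

Because the exponential model approaches its sill asymptotically, the practical range (three times the distance parameter) is commonly reported instead of a true range.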

Jim  Yeh  et  al.   (1986)  measured  soil  water  pressure 
with 94 tensiometers permanently installed at 3 m intervals
along a 290 m transect at a 0.3 m depth in New Mexico.
Observations  showed  a  gradual  increase  of  soil  water 
pressure  over  time  and  a  high  degree  of  spatial 
variability. Variations were spatially correlated over
distances of at least 6 m and they were dependent upon their
mean  value.     These  data  supported  the  hypothesis  obtained 
from  stochastic  analysis  that  the  variation  of  soil  water 
pressure  was  mean-dependent. 

Phillips  (1986)  applied  geostatistics  to  determine  the 
spatial  structure  of  the  pattern  of  variability  of  shore 
erosion  to  identify  the  important  scale  of  variation. 
Shoreline  erosion  was  measured  in  terms  of  recession  rates 
from  two  sets  of  aerial  photographs  taken  in  1940  and  1978. 
Statistical  analysis  indicated  that  variability  of  erosion 
rates  was  high.     The  complex  alongshore  pattern  and  the 
scale  of  local  variability  indicated  that,  despite 
significant  long-range  differences  in  erosion  rates,  short- 
range,  local  factors  were  more  important  in  determining 
differences  in  erosion  rates.     It  was  also  concluded  that 
two  major  factors  accounted  for  alongshore  differences  in 
erosion  rates.     These  were  (i)  a  complex  pattern  of 
differential  resistance  related  to  marsh  fringe  morphology 
and  (ii)  a  crenulated,  irregular  shoreline  configuration 
affecting  exposure  to  wave  energy. 

Several scientists (Burgess et al., 1981; McBratney
and Webster, 1983a, 1983b; Webster and Burgess, 1984;
Webster and Nortcliff, 1984; Russo, 1984) have used
geostatistics to improve sampling techniques. The
classical statistical approach to sampling soils does not
take account of the spatial dependence among the data
within one class. Therefore, it leads to conservative
estimates of precision, with oversampling and unnecessary
cost resulting (Burgess et al., 1981). Burgess et al.
(1981)  presented  a  sampling  strategy  that  depended  on 
accurately  determining  the  semi-variogram  of  the  property, 
and  then  estimation  variances  could  be  calculated  for  any 
combination  of  block  size  and  sampling  density  by  kriging. 
By  this  sampling  method,  the  sampling  density  needed  to 
attain  a  predetermined  precision  could  be  obtained,  and  the 
sampling  effort  needed  to  achieve  the  precision  desired  was 
at  a  minimum. 
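
The idea of relating sampling density to estimation variance can be sketched as follows. This is not the procedure of Burgess et al. (1981), who also considered block kriging; it is a simplified illustration using punctual ordinary kriging at the centre of a square grid cell, a spherical semi-variogram with assumed parameters, and the 16 nearest grid nodes. All names and values are hypothetical.

```python
import numpy as np

def spherical(h, nugget, sill, a):
    """Spherical semi-variogram with total sill `sill` and range `a`."""
    h = np.asarray(h, dtype=float)
    g = np.where(h < a, nugget + (sill - nugget) * (1.5 * h / a - 0.5 * (h / a) ** 3), sill)
    return np.where(h == 0, 0.0, g)

def kriging_variance_at_cell_centre(spacing, nugget, sill, a, half_width=2):
    """Ordinary kriging variance at the centre of a grid cell (target at the origin)."""
    offsets = (np.arange(-half_width, half_width) + 0.5) * spacing
    gx, gy = np.meshgrid(offsets, offsets)
    pts = np.column_stack([gx.ravel(), gy.ravel()])
    n = len(pts)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    # Kriging system: semi-variances among data points plus the unbiasedness constraint.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical(d, nugget, sill, a)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = spherical(np.linalg.norm(pts, axis=1), nugget, sill, a)
    sol = np.linalg.solve(A, b)
    weights, lagrange = sol[:n], sol[n]
    return float(weights @ b[:n] + lagrange)

# Assumed semi-variogram: nugget 0.1, sill 1.0, range 50 m.
for spacing in (10.0, 20.0, 40.0):
    print(spacing, kriging_variance_at_cell_centre(spacing, 0.1, 1.0, 50.0))
```

Scanning a set of candidate spacings in this way and keeping the widest one whose variance stays below a tolerable limit gives the sparsest grid that still meets the required precision, provided the semi-variogram is known.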

McBratney  and  Webster  (1983b)  stated  that  the  number 
of  observations  needed  to  achieve  a  particular  acceptable 
error  depends  on  the  variation  of  the  property  in  the 
region concerned. The assumptions of classical statistics
have required more observations than investigators could
afford to attain the desired precision. These authors used
a  method  for  determining  the  sample  size  that  depended  on 
knowing  the  semi-variogram  of  the  property  of  interest. 
The  semi-variogram  information  was  used  in  kriging  for 
estimation  of  variance  in  the  neighborhood  of  each 
observation point. Variances were pooled to form the
global variance from which a standard error could be
calculated. The pooled value was minimized for a given
sample size if all neighborhoods were of the same size.
Therefore, the sample size required to determine the
semi-variogram would be a major part of the task. So, if
the  semi-variogram  had  not  been  known,  then  the  best 
strategy  was  to  sample  on  a  regular  grid,  with  the  interval 
determined  by  the  number  of  observations  that  could  be 
reasonably  obtained. 

McBratney  and  Webster  (1983a)  extended  the  sampling 
principle  for  each  variable  to  two  or  more  co-regionalized 
variables.     The  choice  of  the  strategy  was  complicated 
because  not  only  did  the  sampling  intensities  of  the  main 
variable  and  subsidiary  variables  differ  but  also  their 
relative  sampling  intensities  could  be  changed. 
Moreover, maximum kriging variance did not necessarily
occur at the center of the sampling configuration as it did
with a single variable. It was stated that in attempting
to find an optimal strategy, the maximum kriging variance
must be found by first calculating the variance for a range
of sample spacings and relative sampling intensities.
Those  that  matched  the  maximum  tolerable  variance  were 
potentially  useful.     It  was  suggested  that  the  optimum 
scheme  was  the  one  that  achieved  the  desired  precision  for 
least cost.

Webster and Burgess (1984) described optimal
rectangular  grid  sampling  configurations  by  which 
estimation  variance  could  be  minimized.     The  geostatistical 
approach  had  the  advantage  that  standard  errors  would  be 
much  smaller  than  with  the  classical  approach.     It  was 
stated  that  even  when  standard  errors  were  estimated 
properly  by  taking  into  account  known  spatial  dependence, 
the  cost  of  making  the  desired  number  of  measurements  in  a 
region  might  still  be  prohibitive.     Under  those 
circumstances  weighting  might  provide  a  feasible  way  of 
overcoming  this  difficulty.     The  aim  of  weighting  was  to 
reduce  the  effort  of  measuring  soil  properties  within 
regions  while  maintaining  the  precision  of  replicated 
observations.     It  was  concluded  that  the  most  serious 
obstacle  to  using  optimal  sampling  strategies  for  single 
estimates  was  the  need  to  know  the  semi-variogram  in 
advance. The main task was to establish the number of samples
needed to determine the semi-variogram.

Webster  and  Nortcliff  (1984)  used  measured  values  of 
extractable  Fe,  Mn,  Cu,  and  Zn  contents  to  calculate  the 
sampling effort required to estimate mean values with
specified  precision.     Semi-variograms  showed  that  there  was 
a  substantial  dependence  for  Fe  and  Mn  contents,  less  for 
Zn  content,  and  even  less  for  Cu  content.  Estimation 
variances  generated  by  classical  methods  and  geostatistics 
were compared. The largest nugget variance in relation to
the total variance in the sample was for Cu. Classical
statistics slightly exaggerated the estimation variance for
Cu. The over-estimate was more serious for Zn, Mn, and Fe.
However, the major disadvantage of the geostatistical
approach was having to sample intensively to obtain the
semi-variogram.

Russo  (1984)  proposed  a  method  to  design  an  optimal 
sampling  network  for  semi-variogram  estimation.     The  method 
required  an  initial  sampling  network.     The  location  of 
points  could  be  either  systematically  or  randomly  selected. 
For  a  given  sample  size  (n)  and  using  a  constant  number  of 
pairs  of  points  for  each  lag  class,  the  sampling  network 
criterion  for  selecting  the  location  of  sampling  points  was 
the  uniformity  of  the  values  of  the  separating  lag  distance 
(h)  within  a  given  lag  class,  for  each  of  the  lag  classes 
which  covered  the  area  of  interest  in  the  field.  The 
method  provided  a  set  of  scaling  factors  which  were  used  to 
calculate  the  new  locations  of  the  sampling  points  by  an 
iterative  procedure.     Using  the  aforementioned  criterion, 
the  best  set  of  sampling  points  was  selected.     Analysis  of 
results  indicated  that  by  using  the  proposed  method  the 
variability  within  and  among  lag  classes  was  considerably 
reduced  relative  to  the  situation  where  the  original 
locations  were  used.     In  addition,  sampling  points 
generated  by  the  method  proposed  fitted  the  theoretical 
semi-variograms better than those which were estimated from
data generated on the original coordinates of sampling
points.

Fractals 

It  has  been  widely  recognized  that  the  perception  of 
soil  variation  is  a  function  of  the  scale  of  observation. 
Fridland  (1976)  was  one  of  the  first  soil  scientists  to 
recognize  that  a  series  of  randomly  operating  but 
interacting  spatial  processes  at  different  scales  could  be 
combined  to  give  definite  soil  patterns. 

Beckett  and  Bie  (1976)  indicated  that  the  variance  of 
the  values  of  any  soil  property  within  a  given  area  is  the 
sum  of  all  contributions  to  the  soil  variability  within  the 
area. Thus, the overall variance within an area of 100 m²
contains contributions from the average variability within
areas of 1 m², and from that between areas of 1 m² within
areas of 5 m², and between areas of 5 m² within areas of 10
m², and between areas of 10 m² within areas of 100 m². The
partition  of  the  total  variance  can  be  performed  for  any 
number  of  stages. 

Wilding  and  Drees  (1978)  pointed  out  that  the  nature 
of  soil  variability  is  dependent  on  the  scale  of 
resolution.    They  indicated  that  at  a  low  resolution  level 
(for  example,  looking  at  the  earth  from  the  moon)  spatial 
diversity  may  be  seen  as  land  vs  water.    With  greater 
resolution, spatial variability can be recognized
microscopically and submicroscopically in the systematic
organization of biological, chemical, and mineralogical
composition of hand specimens representative of given
horizons.

Burrough  (1983a)  stated  that  each  cause  of  soil 
variation  may  not  only  operate  independently  or  in 
combination  with  other  factors,  but  also  over  a  wide  range 
of  scales. 

Soil  variation  has  been  considered  to  be  the  result  of 
a systematic and a random component (Fridland, 1976;
Wilding  and  Drees,  1978;  1983).     The  former  is  related  to 
features  such  as  landform,  geomorphic  elements,  and  soil 
forming  factors.     The  latter  corresponds  to  those  changes 
in  soil  properties  that  are  not  related  to  a  known  cause. 
Random  variability  is  unresolved. 

Burrough  (1983b)  indicated  that  the  distinction 
between  systematic  variation  and  noise  (random  variation) 
is  entirely  scale  dependent  because  increasing  the  scale  of 
observation  almost  always  reveals  structure  in  the  random 
component.     He  also  stated  that  making  allowances  for  the 
artifices  of  map  making,  several  conclusions  can  be  drawn: 
(i)  pattern  structures,  and  therefore  spatial  correlations, 
have  been  recognized  at  all  scales;   (ii)  the  detail 
resolved  is  partly  the  result  of  the  scale  of  variation 
present  and  partly  due  to  the  resolving  power  of  the  map  at 
the given scale; (iii) the intricacy of the drawn
boundaries is not related to scale; and (iv) a feature
regarded  as  random  at  one  scale  can  be  revealed  as 
structure  at  a  larger  scale.     Also,  Burrough  (1983b) 
pointed  out  that  in  any  given  spatial  study  there  may  be 
many  sources  and  scales  of  variability  present.  The 
sources  and  scales  of  variability  come  into  play 
simultaneously  and  affect  observations  over  all  distances 
between  the  resolution  of  the  sampling  device  and  the 
largest  inter-sample  distance.     Therefore,  it  is  necessary 
to  find  a  substitute  for  the  noise  concept  that  takes  into 
account  the  nested,  autocorrelated,  and  scale  dependent 
character  of  unresolved  variations.     Burrough  (1983a; 
1983b;  1983c)  suggested  that  the  concepts  embodied  in 
fractals  appear  to  offer  a  solution. 

The  term  fractal  was  introduced  by  Mandelbrot  (1977) 
specifically  for  temporal  and  spatial  phenomena  that  were 
continuous but not differentiable, and exhibited partial
correlations over many scales. A continuous series, such
as a polynomial, is differentiable because it can be split
into an infinite number of absolutely smooth straight
lines. A non-differentiable continuous series cannot be
resolved in this way. Every attempt to split a
non-differentiable continuous series into smaller parts
results in the resolution of still more structure or
roughness. Fractal etymologically has the same root as
fraction and fragment and means "irregular or fragmented."
It also means "to break."

Fractals have two important characteristics (Burrough,
1983b). They embody the idea of "self-similarity," that
is, the manner in which variations at one scale are
repeated  at  another,  and  the  concept  of  a  fractional 
dimension.     The  concept  of  fractional  dimension  is  the 
source  of  the  name  "fractal." 

Mandelbrot (1977) defined a fractal curve as one where
the Hausdorff-Besicovitch dimension (D) strictly exceeds
the topological dimension. The simplest example is a
continuous  linear  series  such  as  a  polynomial  which  tends 
to  look  more  and  more  like  a  straight  line  as  the  scale  at 
which  it  is  examined  increases.     The  D  value  is  calculated 
using  the  following  equation: 

D = log N / log r          (42)

where D = Hausdorff-Besicovitch dimension,
      N = number of steps used to measure a pattern, and
      r = scale ratio.
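
As a worked illustration of equation (42), the short sketch below evaluates D for two textbook cases that are not taken from the source: a straight line, for which dividing the measuring step by 3 requires 3 times as many steps, and the Koch curve, for which it requires 4 times as many. The function name is hypothetical.

```python
import math

def fractal_dimension(n_steps, scale_ratio):
    """Equation (42): D = log N / log r."""
    return math.log(n_steps) / math.log(scale_ratio)

d_line = fractal_dimension(3, 3)   # straight line: D equals the topological dimension
d_koch = fractal_dimension(4, 3)   # Koch curve: D exceeds the topological dimension
```

The Koch curve gives D of about 1.26, which lies between the topological dimension of a line (1) and that of a plane (2).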

Burrough (1983a) pointed out that for a linear fractal
curve, D may vary between 1 (completely differentiable) and
2 (noisy). The corresponding range for D lies between 2
(absolutely smooth) and 3 (infinitely crumpled) for
surfaces. It is implicit in the concept of fractal that
when fractals are examined at increasingly large scales
increasing  amounts  of  detail  are  revealed,  while  at  the 
same  time  vestiges  of  variations  persist  on  the  smaller 
scale. 

Mandelbrot  (1977)  developed  the  fractal  theory  based 
on  the  physical  Brownian  motion.     Burrough  (1983b,  1983c) 
extended  the  fractal  theory  to  soils  using  Brownian  and 
non-Brownian  fractal  models  and  indicated  that  soil  data 
were  fractals  because  increasing  the  scale  of  mapping 
continued  to  reveal  more  and  more  detail.     Soil  data  were 
not  "ideal"  fractals  because  the  data  did  not  possess  the 
property of self-similarity at all scales. Pure fractals
are  theoretically  infinitely  nested  structures  with 
infinite  variance. 

Burrough  (1981,  1983a)  demonstrated  that  the  double 
logarithmic  plot  of  a  semi-variogram  of  a  series  which  can 
be  represented  by  a  fractional  Brownian  function  was  a 
straight  line  of  slope: 


m = 4 - 2D          (43)

where m = slope, and
      D = Hausdorff-Besicovitch dimension.


Therefore,  semi-variograms  are  also  useful  in 
computing the fractal dimension, but despite this fact,
fractals have not been used by many scientists, especially
soil  scientists. 
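
Equation (43) can be rearranged to D = 2 - m/2, so D can be estimated from the slope of a straight line fitted to the double logarithmic plot of the semi-variogram. The following is a minimal sketch under that relationship; the function name and the synthetic semi-variogram values are assumptions for illustration, and a least-squares fit is only one of several ways to obtain the slope.

```python
import numpy as np

def fractal_dimension_from_semivariogram(lags, gammas):
    """Fit log(gamma) against log(h) and apply D = 2 - m / 2 (from m = 4 - 2D)."""
    slope, _intercept = np.polyfit(np.log(lags), np.log(gammas), 1)
    return 2.0 - slope / 2.0, slope

# Synthetic semi-variograms: a linear one (slope 1) and a flatter one (slope 0.2).
lags = np.array([10.0, 20.0, 40.0, 80.0, 160.0])
d_linear, m_linear = fractal_dimension_from_semivariogram(lags, 0.5 * lags)
d_noisy, m_noisy = fractal_dimension_from_semivariogram(lags, 2.0 * lags ** 0.2)
```

A linear semi-variogram (slope 1) corresponds to D = 1.5, the Brownian case, whereas a nearly flat semi-variogram gives D close to 2.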

Burrough  (1981)  computed  D  from  semi-variograms  of 
different  soil  properties.     D  values  varied  between  1.1  and 
1.9.     Low  values  indicated  a  predominance  of  a  systematic 
variation  in  soil  properties  studied.     Large  values 
indicated  a  random  variation  of  soil  properties.     Most  of 
the  fractal  values  were  between  1.5  and  1.9.     Fractals  were 
also  useful  in  revealing  short-  and  long-range  variation 
when  the  D  dimension  was  used  along  the  semi-variogram 
range.     Low  values  of  D  indicated  domination  of  long-range 
variation. 

Fractals have also been applied to erosion studies.
Phillips  (1986)  studying  shoreline  erosion  used  the 
methodology  proposed  by  Burrough  (1981,  1983b).  He 
calculated  a  D  value  of  1.91.     This  value  indicated  a  very 
complex,  irregular  pattern  of  erosion  which  was 
statistically  random.     It  also  indicated  a  pattern 
dominated  by  short-range,  local  controls  which  completely 
obscured  any  long-range  trends  that  may  have  existed. 
A  negative  correlation  between  adjacent  sites  was  also 
found.     Phillips  (1986)  concluded  that  the  complex 
landscape  revealed  by  the  analysis  was  probably  related  to 
the  dynamic  nature  of  estuaries  and  coastal  wetlands  and 
the  variety  of  geomorphic,  ecological,  and  human  factors 
that  influenced  marsh  and  shoreline  development. 


DESCRIPTION  OF  STUDY  AREA 
Location 

The  area  studied  is  located  in  northwest  Florida.  It 
extends  from  Santa  Rosa  County  on  the  west  to  Madison 
County  on  the  east,  and  comprises  most  of  the  Florida 
Panhandle  (Figure  5). 

Physiography,  Relief,  and  Drainage 
The  study  area  lies  in  the  Coastal  Plain  Province 
(Duffee et al., 1979, 1984; Sanders, 1981; Sullivan et al.,
1975; Weeks et al., 1980). The landscape is largely the
product of streams and waves acting upon the land surface
over the past 10 to 15 million years (Fernald and Patton,
1984).

The  major  physiographic  divisions  in  the  area  are  the 
Northern  Highlands  and  the  Marianna  Lowlands.  They  comprise 
the  Southern  Pine  Hills,  the  Dougherty  Karst,  the  Tifton 
Uplands,  the  Apalachicola  Delta,  and  the  Ocala  Uplift 
physiographic districts. Elevations in the Northern
Highlands range from 16 m or less to 114 m above sea level.
Several stream systems have produced a significant
erosional feature called the Marianna Lowlands, which
interrupts the continuous span of the highlands across
northwest Florida. Elevations in the Marianna Lowlands
range from 20 to 80 m above sea level (Brooks, 1981a;
Fernald and Patton, 1984).

Figure 5. Location of the counties from which
characterization data were available for pedons selected
for study.

Topography  varies  from  nearly  level  to  gently 
undulating,  with  slopes  ranging  from  0  to  35%. 
Commonly, the gentle slopes terminate in sinks or shallow
depressions.

The  drainage  system  is  well  organized  in  streams  that 
flow  southward  from  Alabama  and  Georgia.     The  Chattahoochee 
and  Flint  Rivers  combine  to  form  the  Apalachicola  River, 
the largest in this southward-flowing group of rivers.
Some of the drainage is disjointed, particularly in the
karst topography of the Marianna Lowlands (Fernald, 1981).

Geology 

Soils are mainly underlain by the Citronelle
Formation, the Crystal River Formation, and by
undifferentiated Miocene and Oligocene sediments (Fernald,
1981).

The Citronelle Formation is composed of sand, gravels,
and clays of Pliocene age. The Crystal River Formation
comprises shallow marine limestone of Eocene age. Miocene
and Oligocene sediments are mainly composed of "silty"
sand, clay, dolomitic limestone, and fossiliferous shallow
marine limestone. Some of the materials are part of the
Marianna Limestone Formation.

Climate 

The  climate  of  the  area  is  controlled  by  latitude  and 
proximity  to  the  Gulf  of  Mexico.     The  area  studied  is 
characterized  by  long,  warm  summers  and  short,  mild  winters 
(Bradley,  1972).    Maximum  and  minimum  temperatures  are 
affected  by  breezes  coming  from  the  Gulf  of  Mexico. 

The average annual temperature is approximately 21°C.
Maxima of about 38°C occur in June to August and minima of
about -10°C occur in January and February. The average
growing  season  is  approximately  275  days. 

The  average  annual  rainfall  ranges  between  1400  and 
1660  mm.     Approximately  50%  of  the  average  rainfall  falls 
during  a  4-month  rainy  season  from  June  to  September.  A 
second  period  of  relatively  high  rainfall  occurs  in  the 
late  winter  and  early  spring.     Frequently,  a  short  drought 
during  the  late  spring  causes  considerable  moisture  stress 
to  trees,  crops,  and  grasses. 

Land  Use  and  Vegetation 
The area studied has a considerable extent of prime
farmland that is suitable for producing crops and
sustaining high yields under high levels of
management (Caldwell, 1980). Most of the acreage is used
for urbanization, field crops, pasture, and forestry. The
most common crops are corn (Zea mays), soybean (Glycine
max), peanuts (Arachis hypogaea), watermelon (Citrullus
vulgaris), tobacco (Nicotiana spp.), and assorted
vegetables. Livestock operations are also common.

A  large  part  of  the  area  is  also  covered  by  forest. 
Well drained areas are characterized by the presence of
slash pine (Pinus elliottii var. elliottii Engelm.), blackjack
oak (Quercus marilandica Munch.), turkey oak (Quercus
laevis Walt.), bluejack oak (Quercus incana Bartr.), longleaf
pine (Pinus palustris Mill.), and laurel oak (Quercus
hemisphaerica Bartr.). The poorly drained areas,
corresponding to shallow, densely wooded swamps and river
valley lowlands, are characterized by the presence of saw
palmetto (Serenoa repens Bartr.), sweet gum (Liquidambar
styraciflua L.), and cypress (Cupressus sp. L.) (Duffee et
al., 1979, 1984; Sanders, 1981; Sullivan et al., 1975;
Weeks et al., 1980).


Soils 

Soils  in  the  area  studied  have  developed  from  medium- 
textured  marine  sediments.     These  coastal  plain  materials 
were  transported  from  uplands  farther  north  during 
interglacial  periods  when  the  present  land  areas  were 
inundated  by  water  from  the  Gulf  of  Mexico.     Most  of  the 
soils  in  the  study  area  are  characterized  by  a  low  level  of 
natural fertility and are susceptible to erosion (Duffee et
al., 1979, 1981; Sanders, 1981; Sullivan et al., 1975;
Weeks et al., 1980).

Approximately  83%  of  the  soils  are  classified  as 
Ultisols  (Table  1).     Complete  taxonomic  classification  is 
presented  in  Appendix  A.     In  general,  the  Typic  Hapludults; 
and  the  Typic,  Aquic,  Plinthic,  and  Rhodic  Paleudults  are 
well  and  moderately  well-drained,  with  moderate  to  low 
available  water  capacity  and  with  moderate  to  moderately 
slow permeability. These soils are acidic and low in organic
matter and nutrient contents. In gently sloping areas,
limitations are moderate for cultivated crops due to the
erosion hazard.

Arenic Hapludults; Arenic, Grossarenic, Arenic
Plinthic, and Grossarenic Plinthic Paleudults; and Typic
Quartzipsamments commonly are well to excessively drained.
Permeability varies from rapid to moderately rapid, and
available water capacity is low to very low. Droughtiness
and low water retention capacity are among the principal
limitations for cropping on these soils.

Typic Fluvaquents; Typic Humaquepts; Typic
Ochraqualfs; Ultic Haplaquods; Typic, Arenic, Grossarenic,
Aeric, Plinthic, Umbric, and Arenic Umbric Paleaquults;
Typic Albaquults; and Typic and Aeric Ochraquults are
typically  poorly  drained.     Permeability  varies  from 
moderate  to  slow.     Excessive  wetness  and  flooding  are  among 
the most important limitations for growing crops.

Table 1. Order, Great Group, and relative proportion of
pedons studied.

Order            Great Group         Number of pedons studied       %
Alfisols         Hapludalfs                     2                  1.3
                 Ochraqualfs                    2                  1.3
Entisols         Quartzipsamments               5                  3.3
                 Others                         2                  1.3
Inceptisols      Dystrochrepts                  1                  0.7
                 Humaquepts                     1                  0.7
Spodosols        Haplaquods                     2                  1.3
Ultisols         Hapludults                    10                  6.6
                 Paleudults                    97                 64.5
                 Paleaquults                   15                  9.9
                 Others                         3                  2.0
Non-designated series *                        11                  7.1
TOTAL                                         151                100.0

* These pedons have not been classified.


MATERIALS  AND  METHODS 


Data  Source 

Data from 151 pedons (Calhoun et al., 1974; Carlisle et
al., 1978, 1981, 1985; I.F.A.S. Soil Characterization
Laboratory, unpublished data) were used for the study. In
total, 20 soil properties were selected (horizon thickness;
very coarse, coarse, medium, fine, and very fine sand
fractions; total sand, silt, and clay contents; pH-water;
pH-KCl; organic carbon content; Ca, Mg, Na, and K contents
extractable in NH4OAc; total bases; extractable acidity;
CEC; and base saturation). The criterion for selection was
that these soil properties had to have been measured for
each horizon of the pedon. The number of horizons per
pedon varied between 4 and 7. There were 19,820
observations.

Pedon  location,  description,  and  sampling  were  done  by 
soil  scientists  from  U.S.D.A.  Soil  Conservation  Service  and 
the  I.F.A.S.  Soil  Science  Department.     Physical  and 
chemical  analyses  of  the  soils  were  made  by  the  personnel 
of  the  Soil  Characterization  Laboratory  of  the  University 
of Florida, Gainesville. Procedures used for sampling and
chemical and physical analysis were outlined by Calhoun et
al. (1974) and by Carlisle et al. (1978, 1981, 1985).

Approximately half of the data were already stored on
an IBM XT microcomputer using the database management
software KeepIT (ITsoftware, 1984). The remaining half
had to be entered to complete the set of observations for
this study.

Location  of  Pedons 

The  pedons  selected  for  study  were  located  for  soil 
survey  purposes  using  the  system  of  Ranges  and  Townships 
with  the  Tallahassee  Meridian  and  Base  Line  as  reference. 
The  program  used  for  spatial  analysis  requires  the  location 
of  pedons  expressed  by  geographic  coordinates  (Xs  and  Ys). 
Therefore,  each  pedon  was  located  on  topographic  maps  at 
1:24,000  scale  according  to  the  system  of  Ranges  and 
Townships,  and  each  location  was  transformed  into  cartesian 
coordinates  (longitude  and  latitude).     Elevation  above  sea 
level  was  also  recorded. 

The  map  of  physiographic  regions  of  Florida  (Brooks, 
1981b)  at  the  1:500,000  scale  was  used  as  a  base  map  to 
locate the entire set of pedons. Using as a reference the
point 30° 00' 00" N and 87° 24' 18" W (X = 0 and Y = 0),
X and Y coordinates were determined. This reference point
was used to allow only positive Xs and Ys in the studied
area.

Pedon locations were plotted using the POST command of
Surface II software (Sampson, 1978).
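
As an illustration of how X and Y values relative to the reference point could be obtained, the following Python sketch converts degrees, minutes, and seconds to decimal degrees and then to approximate offsets. The spherical-Earth radius, the kilometre units, and the pedon location are assumptions made for the example; the text does not state the units or the procedure actually used to read coordinates from the maps.

```python
import math

def dms_to_degrees(degrees, minutes, seconds):
    """Convert degrees, minutes, seconds to decimal degrees."""
    return degrees + minutes / 60.0 + seconds / 3600.0

# Reference point given in the text: 30 deg 00' 00" N, 87 deg 24' 18" W.
REF_LAT = dms_to_degrees(30, 0, 0)
REF_LON = -dms_to_degrees(87, 24, 18)   # west longitude taken as negative

EARTH_RADIUS_KM = 6371.0  # ASSUMPTION: mean spherical radius, not from the source

def offsets_km(lat, lon):
    """Approximate east (X) and north (Y) offsets from the reference, in km."""
    y = math.radians(lat - REF_LAT) * EARTH_RADIUS_KM
    x = (math.radians(lon - REF_LON) * EARTH_RADIUS_KM
         * math.cos(math.radians(REF_LAT)))
    return x, y

# Hypothetical pedon location east and north of the reference point.
x, y = offsets_km(30.5, -85.0)
```

Because the reference point lies to the south and west of the study area, both offsets come out positive, which is the property the text relies on.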

Statistical  Analyses 

Statistical analyses were performed using an IBM XT
microcomputer and the IFAS-VAX and NERDC mainframe
computers. Data were transferred between the microcomputer
and the mainframe computers with the public domain
communication programs Kermit (to link with IFAS-VAX) and
YT (to link with CMS-NERDC).

Statistical Analysis System software (SAS Institute
Inc., 1982a, 1982b) was used for the normality and principal
component analyses and for plotting purposes. The Fortran
program written by Skrivan and Karlinger (1979) was used
for the geostatistical analysis. Surface II software
(Sampson, 1978) was employed to generate isarithmic
(contour) maps and surface diagrams.

Normality Analysis

The  UNIVARIATE  procedure  (SAS  Institute  Inc.,  1982a) 
was  used  to  test  normality.     This  test  was  mainly  based  on 
the  study  of  skewness,  kurtosis,  the  Kolmogorov  test,  and 
cumulative  plots. 
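
The quantities named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses simple moment estimators for skewness and kurtosis (SAS applies its own corrections, so values may differ slightly), it computes the Kolmogorov D statistic against a normal distribution with the sample mean and variance, and the fine sand values are hypothetical.

```python
import math

def sample_moments(values):
    """Return mean, variance (n - 1 divisor), skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    # Central moments with divisor n (simple, uncorrected estimators).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2 - 3.0   # zero for a normal distribution
    return mean, variance, skewness, kurtosis

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def kolmogorov_d(values):
    """Largest gap between the empirical and the fitted normal CDF."""
    n = len(values)
    mean, variance, _, _ = sample_moments(values)
    sd = math.sqrt(variance)
    d = 0.0
    for i, v in enumerate(sorted(values)):
        f = normal_cdf(v, mean, sd)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Hypothetical fine sand contents (%), for illustration only.
fine_sand = [28.0, 31.5, 35.2, 30.1, 40.3, 25.7, 33.3, 36.8]
mean, variance, skewness, kurtosis = sample_moments(fine_sand)
d_stat = kolmogorov_d(fine_sand)
```

The sketch returns only the D statistic; the significance probability that SAS reports as PROB>D is not reproduced here.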

The NORMAL option was employed to compute a test
statistic for the hypothesis that the input data had a
normal distribution. The Kolmogorov D statistic was
computed because the sample size was greater than 50.

The PLOT option was used to plot the data. The CHART
procedure was employed to obtain histograms of the data.

Principal Component Analysis

The PRINCOMP procedure (SAS Institute Inc., 1982b) was
employed for the PCA. Because the soil properties studied
had different measurement scales, there was a risk of
having heterogeneous variances. An important assumption in
this analysis is the homogeneity of variances (Afifi and
Clark, 1984). Therefore, soil properties were standardized
to mean equal to 0 and variance equal to 1. As a result,
the PCs were derived from the correlation matrix instead of
the covariance matrix. Eigenvalues (variances) and
eigenvectors (coefficients) of PCs were obtained by using
the PRINCOMP procedure.
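
The link between standardization and the correlation matrix can be shown with a short Python sketch. It is not the PRINCOMP procedure; it only demonstrates that the covariance matrix of variables standardized to mean 0 and variance 1 is their correlation matrix. The clay and CEC values are hypothetical.

```python
import math

def standardize(column):
    """Rescale a list of values to mean 0 and variance 1 (n - 1 divisor)."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def correlation_matrix(columns):
    """Covariance matrix of the standardized columns, i.e. the correlations."""
    z = [standardize(c) for c in columns]
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z]
            for zi in z]

# Hypothetical pedon values, for illustration only.
clay = [5.0, 12.0, 25.0, 31.0, 18.0]   # %
cec = [2.1, 4.0, 8.5, 10.2, 6.3]       # cmol/kg
corr = correlation_matrix([clay, cec])
```

The diagonal of `corr` is 1 regardless of the original units, which is why properties measured in cm, %, and cmol/kg can enter the same analysis.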

The number of PCs was selected by using a rule of
thumb (Afifi and Clark, 1984, p. 322): the PCs selected
are those that explain at least 100/P percent of the total
variance, where P is the number of variables. With P = 20,
the PCs selected had an eigenvalue that represented > 5% of
the total variance. Eigenvector coefficients for each PC
were selected on the basis that their absolute value was
larger than the value calculated using the following
equation:

$$ S_c = \frac{0.5}{(\text{PC eigenvalue})^{1/2}} \tag{44} $$

where Sc = selection criterion

The PLOT procedure (SAS Institute Inc., 1982a) was
employed to plot eigenvectors. The larger the value and
the closer the eigenvector to the PC axis, the larger the
contribution of the variable to the total variance. A
varimax rotation (orthogonal rotation of axes) was used
because some of the eigenvectors did not show a clear
contribution to a particular PC.
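
The two selection rules can be checked numerically with the eigenvalues reported in Table 3. The Python sketch below assumes that the exponent in equation (44) is 1/2; that reading reproduces the Sc values reported in the Results (0.2056 for the first PC). The function names are illustrative.

```python
import math

# Eigenvalues of the 20 principal components, as listed in Table 3.
EIGENVALUES = [5.9119, 3.0450, 2.5310, 1.9153, 1.2385, 0.8040, 0.7824,
               0.6933, 0.6382, 0.5872, 0.4871, 0.4377, 0.3458, 0.2393,
               0.2020, 0.1209, 0.0168, 0.0037, 0.0002, 0.0000]

def select_components(eigenvalues):
    """Keep the PCs that explain at least 100/P percent of the variance."""
    p = len(eigenvalues)
    total = sum(eigenvalues)
    threshold = 100.0 / p
    return [i + 1 for i, ev in enumerate(eigenvalues)
            if 100.0 * ev / total >= threshold]

def selection_criterion(eigenvalue):
    """Sc = 0.5 / sqrt(eigenvalue); ASSUMES the exponent in (44) is 1/2."""
    return 0.5 / math.sqrt(eigenvalue)

selected = select_components(EIGENVALUES)
criteria = [round(selection_criterion(EIGENVALUES[i - 1]), 4)
            for i in selected]
```

With these inputs the rule of thumb retains the first five PCs, and the computed criteria match the five Sc values quoted in the Results.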

The FACTOR procedure (SAS Institute Inc., 1982b) was
employed for the varimax rotation and to plot the rotated
eigenvectors.

Each  PC  is  a  linear  combination  of  standardized 
variables  having  the  eigenvectors  as  coefficients.     Due  to 
this  fact,  collinearity  between  variables  can  be  a  problem. 
It  has  been  reported  (SAS  Institute  Inc.,  1982b)  that  use 
of  highly  correlated  variables  produces  estimates  with  high 
standard  errors.     These  estimates  are  very  sensitive  to 
slight  changes  in  the  data. 

The  REG  procedure  (SAS  Institute  Inc.,  1982b)  with  the 
option  COLLIN  was  used  for  the  analysis  of  collinearity. 
Variables  with  a  tolerance  lower  than  0.01  were  not 
considered  in  the  analysis  (Afifi  and  Clark,  1984). 
Tolerance is defined as:

$$ T = 1 - R^2 \tag{45} $$

where T = tolerance
      R = coefficient of multiple correlation
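
A minimal Python sketch of the tolerance calculation follows. It assumes that tolerance is 1 - R², where R² comes from regressing each variable on the others, and it uses the identity that this equals the reciprocal of the corresponding diagonal element of the inverse correlation matrix. It is not the REG procedure, and the correlation matrix is hypothetical.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def tolerances(corr):
    """Tolerance of each variable: 1 - R^2 = 1 / diagonal of the inverse."""
    inverse = invert(corr)
    return [1.0 / inverse[j][j] for j in range(len(corr))]

# Hypothetical correlation matrix for three soil properties.
corr = [[1.00, 0.95, 0.30],
        [0.95, 1.00, 0.25],
        [0.30, 0.25, 1.00]]
tol = tolerances(corr)
excluded = [j for j, t in enumerate(tol) if t < 0.01]
```

In this example the first two variables are strongly related and have low tolerance, but neither falls below the 0.01 cutoff used in the text.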


Finally, the correlation coefficient between the PCs
and the soil properties was computed using the equation:

$$ r_{ij} = a_{ij} \, (\text{VAR PC})^{1/2} \tag{46} $$

where r_ij = correlation coefficient
      a_ij = eigenvector
      VAR PC = PC eigenvalue
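
Equation (46) can be verified against values reported in the Results. The Python sketch below takes the first-PC eigenvalue from Table 3 and three first-PC eigenvector coefficients from Table 4, assumes the exponent in equation (46) is 1/2, and applies the 0.75 cutoff that the Methods state for selecting properties. The function name is illustrative.

```python
import math

PC1_EIGENVALUE = 5.9119   # Table 3

# First-PC eigenvector coefficients for three properties (Table 4).
pc1_coefficients = {"Silt": 0.1623, "Clay": 0.3109, "CEC": 0.3755}

def correlation_with_pc(coefficient, eigenvalue):
    """r = a * sqrt(eigenvalue); ASSUMES the exponent in (46) is 1/2."""
    return coefficient * math.sqrt(eigenvalue)

correlations = {name: correlation_with_pc(a, PC1_EIGENVALUE)
                for name, a in pc1_coefficients.items()}

# Properties whose correlation with the PC exceeds 0.75 in absolute value.
high = [name for name, r in correlations.items() if abs(r) > 0.75]
```

The computed correlations agree with the first column of Table 6 to four decimals (0.3946, 0.7559, and 0.9130).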

Soil properties selected for further study were those
having a high (> |0.75|) correlation coefficient.

Geostatistical Analysis

A Fortran program written by Skrivan and Karlinger
(1979) was employed. The geostatistical analysis had four
parts.

Semi-variograms. The X, Y, and Z (soil property)
values were used as input in this step. Before a valid
semi-variogram can be calculated, the drift, if present,
must be removed; otherwise the stationarity assumption is
not fulfilled.

Journel and Huijbregts (1978) stated the criterion for
deciding whether drift is absent. They indicated that,
considering the semi-variogram as a positive definite
function, an experimental semi-variogram that increases at
least as rapidly as |h|^2 (where |h| = modulus of the lag
distance) for large distances h is incompatible with the
intrinsic hypothesis. Such an increase in the semi-variogram
most often indicates the presence of a trend or drift.
However, drift can be detected only after the semi-variogram
has been calculated. Thus, an iterative process (trial and
error) was followed to calculate the semi-variogram.

An  observed  semi-variogram  based  on  the  data  was 
calculated.     If  drift  was  present,  then  the  information 
contained  in  the  observed  semi-variogram  was  used  to 
calculate  the  drift  coefficients  and  residuals  of  the 
observations  relative  to  the  drift  function.     Then,  a  new 
semi-variogram  from  the  residuals  could  be  calculated. 
This  process  was  repeated  until  drift  was  removed  or  a 
satisfactory  semi-variogram  was  obtained. 

Five semi-variograms were calculated for each
variable: direction-independent and direction-dependent
(N-S, E-W, NE-SW, NW-SE). The semi-variogram plots were
obtained by using the Energraphics software (Enertronics,
1983).
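
For readers unfamiliar with the calculation, a direction-independent experimental semi-variogram can be sketched in Python as follows. This is not the Skrivan and Karlinger program. The lag classes, the function name, and the transect data are assumptions for the example. The data carry a pure linear trend, so the result also shows the parabolic growth that signals drift.

```python
import math

def experimental_semivariogram(points, lag, n_lags):
    """Direction-independent experimental semi-variogram.

    points : list of (x, y, z) tuples
    lag    : lag spacing, in the same units as x and y
    n_lags : number of lag classes, centred on lag, 2 * lag, ...
    Returns a list of (distance, semi-variance, number of pairs).
    """
    sums = [0.0] * (n_lags + 1)
    pairs = [0] * (n_lags + 1)
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)
            k = int(h / lag + 0.5)          # nearest lag class
            if 1 <= k <= n_lags:
                sums[k] += (zi - zj) ** 2
                pairs[k] += 1
    # Semi-variance is half the mean squared difference in each class.
    return [(k * lag, sums[k] / (2 * pairs[k]), pairs[k])
            for k in range(1, n_lags + 1) if pairs[k] > 0]

# Hypothetical transect with a linear trend, z = 2x.
transect = [(float(x), 0.0, 2.0 * x) for x in range(6)]
gamma = experimental_semivariogram(transect, lag=1.0, n_lags=3)
```

Here the semi-variance equals 2h² at every lag, which is the behaviour that the drift-removal step is meant to eliminate before a model is fitted.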

Fitting semi-variograms. In this step the structural
information (range, lag distance, and slope) was used to
adjust the parameters in the semi-variogram until the model
was theoretically consistent (Gambolati and Volpi, 1979).
Consistency occurred when the kriged average error (KAE)
was approximately zero and the average ratio of theoretical
to calculated variance, called the reduced mean square error
(RMSE), was approximately equal to one. These parameters
are represented by the following equations:


$$ \text{(i)} \quad \mathrm{KAE} = \frac{1}{n} \sum_{i=1}^{n} (Z_i - \hat{Z}_i) \tag{47} $$

where n = number of points
      Z_i = measured value
      \hat{Z}_i = kriged value

$$ \text{(ii)} \quad \mathrm{RMSE} = \frac{1}{n} \sum_{i=1}^{n} \frac{(Z_i - \hat{Z}_i)^2}{\sigma^2} \tag{48} $$

where \sigma^2 = calculated variance and is equal to

$$ \sigma^2 = K(0) - \sum_{i=1}^{n-1} \Gamma_i \, C(h) - \sum_{i=1}^{n} \mu_i \, M(h) + \sum_{i=1}^{n-1} \Gamma_i^2 \, S_i^2 \tag{49} $$

where K(0) = sill
      \Gamma_i = unknown weighting coefficient
      C(h) = covariance based on semi-variance and sill
      \mu_i = unknown Lagrangian multiplier
      M(h) = drift
      S_i^2 = variance of the measurement error
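
The two consistency measures are simple to compute once measured values, kriged values, and kriging variances are available. The Python sketch below uses hypothetical numbers and takes the calculated variances as given rather than deriving them from equation (49).

```python
def kriged_average_error(measured, kriged):
    """Mean of (measured - kriged); close to zero for a consistent model."""
    n = len(measured)
    return sum(z - zk for z, zk in zip(measured, kriged)) / n

def reduced_mean_square_error(measured, kriged, variances):
    """Mean of squared errors divided by the calculated kriging variances."""
    n = len(measured)
    return sum((z - zk) ** 2 / v
               for z, zk, v in zip(measured, kriged, variances)) / n

# Hypothetical cross-validation results, for illustration only.
measured = [12.0, 18.5, 25.0, 30.5]
kriged = [13.0, 17.5, 26.5, 29.0]
variances = [1.5, 1.0, 2.0, 2.5]

kae = kriged_average_error(measured, kriged)
rmse = reduced_mean_square_error(measured, kriged, variances)
```

A model would be judged consistent when `kae` is near zero and `rmse` is near one; in this made-up example the errors cancel exactly and the ratio is slightly below one.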

The  fitting  procedure  was  based  on  the  jackknife 
method  developed  by  Tukey  (Sokal  and  Rohlf,  1981)  which  is 
a  useful  technique  for  analyzing  statistics  if 
distributional  assumptions  are  of  concern. 

The procedure was to split the observed data into
groups (usually of size one) and to compute values of the
statistic with a different group of observations being
ignored each time. The average of these estimates was used
to reduce the bias in the statistic. The variability among
these values was used to estimate the standard error.
Gambolati and Volpi (1979) extended the use of this
technique to geostatistics.
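
The leave-one-out idea can be sketched in Python with the pseudo-value form of the jackknife. The sketch assumes groups of size one and an arbitrary statistic supplied as a function; it illustrates the general technique and is not the fitting routine of the kriging program. The data are hypothetical.

```python
import math

def jackknife(values, statistic):
    """Jackknife estimate and standard error using groups of size one."""
    n = len(values)
    full = statistic(values)
    pseudo_values = []
    for i in range(n):
        rest = values[:i] + values[i + 1:]       # ignore observation i
        pseudo_values.append(n * full - (n - 1) * statistic(rest))
    estimate = sum(pseudo_values) / n            # bias-reduced estimate
    variance = sum((p - estimate) ** 2 for p in pseudo_values) / (n - 1)
    return estimate, math.sqrt(variance / n)     # estimate, standard error

def mean(values):
    return sum(values) / len(values)

# Hypothetical observations, for illustration only.
data = [2.0, 4.0, 6.0, 8.0]
estimate, standard_error = jackknife(data, mean)
```

For the mean, the pseudo-values reduce to the observations themselves, so the jackknife returns the ordinary mean and its usual standard error; the technique becomes useful for statistics with no simple variance formula.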

Kriging.     Universal  kriging  was  the  method  used  in 
this  investigation.     Universal  kriging  takes  into  account 
local  trends  in  data,  minimizing  the  error  associated  with 
estimation.     The  kriged  Z  value  for  X  and  Y  location  and 
its  associated  variance  were  computed. 

The kriged Z values and associated standard errors
were the inputs to the Surface II software to produce
isoline maps of the different values and associated
variances.

Fractals. Statistical Analysis System (SAS Institute
Inc., 1982a, 1982b) was employed for transforming
semi-variance and lag distance values into logarithmic
values.

The REG procedure was used to obtain the slope of the
line. The Hausdorff-Besicovitch dimension was computed by
using equation (43).
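
The log-log regression step can be sketched in Python as below. The slope calculation is ordinary least squares. The conversion from slope to dimension is an assumption: the sketch uses D = 2 - m/2, a form commonly applied to one-dimensional transects, and should be checked against equation (43), which is not reproduced in this section. The lag and semi-variance values are hypothetical.

```python
import math

def loglog_slope(lags, semivariances):
    """Least-squares slope of log(semi-variance) against log(lag)."""
    xs = [math.log(h) for h in lags]
    ys = [math.log(g) for g in semivariances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def fractal_dimension(slope):
    """ASSUMPTION: D = 2 - slope / 2; verify against equation (43)."""
    return 2.0 - slope / 2.0

# Hypothetical semi-variogram following a power law with exponent 0.5.
lags = [1.0, 2.0, 4.0, 8.0]
semivariances = [3.0 * h ** 0.5 for h in lags]

slope = loglog_slope(lags, semivariances)
dimension = fractal_dimension(slope)
```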

Finally,  this  dissertation  was  written  using 
WordPerfect  software  (SSI  Software,  1985). 


RESULTS  AND  DISCUSSION 


Test of Normality

The assumption of normality is important for most
statistical analyses. If the data are normally distributed,
the mean and standard deviation completely characterize the
distribution of values, and approximately 95% of the values
fall within two standard deviations of the mean (Montgomery,
1976; Snedecor and Cochran, 1980; SAS Institute Inc.,
1982a).

Gower (1966), however, pointed out that in PCA, unlike
other forms of multivariate analysis, no assumptions are
needed about the distribution of the variates or about
hypothetical populations, except when significance tests
are of interest. Likewise, Gutjahr (1985) and Olea (1975)
have stated that the assumption of normality is not needed
in geostatistics. Stationarity is the most important
assumption in geostatistics, although Burrough (1983a)
indicated that stationarity is very difficult to achieve.

Normality, therefore, is not required for PCA and
geostatistics. However, the test of normality was
performed because a large number of soil variability
studies have implicitly assumed a normal distribution of
soil properties without using any statistical test to
justify this assumption. Also, a large data base was
available. Thus, a conclusion such as "data were
non-normally distributed because of the small number of
observations" has no validity in this study.

There are two main types of normality tests. One is a
graphical method based on histograms or plots of measured
values on probability paper. The other is based on a
quantitative measure such as the Kolmogorov test. Rao et
al. (1979) indicated that graphical methods have specific
drawbacks. First, they often rely on visual inspection,
and thus are subject to human error. Second, as graphical
methods are not based on quantitative measures, an
objective statistical evaluation of the goodness-of-fit of
the theoretical distribution to the measured data is not
possible. Consequently, the normality analysis was based
more on a quantitative measure than on a graphical method.

The data were tested against a theoretical normal
distribution with mean and variance equal to the sample
mean and variance. Skewness, kurtosis, the Kolmogorov D
statistic, and plots of the data were used to test the null
hypothesis that the input data values were normally
distributed (SAS Institute Inc., 1982a).

When the distribution is not symmetric, the skewness
can be positive (skewed to the right) or negative (skewed
to the left). Kurtosis refers to the degree of peakedness
of a frequency distribution (Silk, 1979). A heavy-tailed
distribution has positive kurtosis. Flat distributions
with short tails, or distributions in which almost all data
values appear very close to the mean, have negative
kurtosis. The measures of skewness and kurtosis for a
normally distributed population are zero (SAS Institute
Inc., 1982a).

A significance level (α) of 0.15 was selected as the
criterion for acceptance or rejection of the null
hypothesis (H0: normal). When normality is tested, the
interest is in accepting the null hypothesis. This is in
contrast to most situations, in which the interest is in
rejecting the null hypothesis. For this reason, Rao et
al. (1979) proposed an α value between 0.15 and 0.20 in
order to have a balance between type I and type II errors.

Statistical moments for each soil property were
computed (Table 2). Most variables had large coefficients
of variation (C.V.). Soil pH (water and KCl) had the
lowest variation, reflecting uniform pH conditions, in
this case acidity.

Other soil properties had a large C.V. Most of these
soil properties are naturally related, and the large C.V.s
were mutually influenced. For example, the amount of
extractable cations (Ca, Mg, Na, and K) depends largely on
the CEC, which in turn depends on particle size.

Table 2. Statistical moments of soil properties studied
and Kolmogorov test.

Soil         Mean   Variance   C.V.   Skewness  Kurtosis  D:Normal  PROB>D
property *                     (%)

TH          30.2     365.8     63.3     1.44      2.92      0.12     <.01
VC           1.2       5.4    189.6     4.59     29.3       0.30     <.01
C            6.4      39.5     97.5     1.36      1.53      0.15     <.01
M           17.0     125.6     65.9     0.86      1.11      0.06     <.01
F           32.7     209.2     44.2     0.40     -0.10      0.07     <.01
VF          12.9      68.8     64.4     1.11      1.92      0.07     <.01
TS          70.0     354.8     26.9    -1.18      1.63      0.08     <.01
Silt        10.7      78.4     83.0     3.92     27.0       0.16     <.01
Clay        19.4     260.9     83.3     1.33      2.11      0.12     <.01
PH1          5.1       0.35    11.6    -0.67     13.5       0.12     <.01
PH2          4.2       0.28    12.5    -0.19      9.91      0.10     <.01
OC           0.43      0.52   167.8     3.41     14.1       0.28     <.01
Ca           0.94      5.51   250.3     6.15     49.1       0.34     <.01
Mg           0.36      0.81   253.2    10.8     154.3       0.35     <.01
Na           0.03      0.002  130.8     3.62     25.5       0.22     <.01
K            0.06      0.009  170.5     3.72     19.3       0.28     <.01
TB           1.38      8.76   213.7     6.05     49.1       0.32     <.01
EXT          5.61     30.8     98.9     2.79     12.8       0.16     <.01
CEC          7.01     49.1     99.9     2.83     11.0       0.18     <.01
BS          18.8     399.7    106.2     1.78      2.90      0.18     <.01

* See Abbreviations, pp. xii-xiii.

TH is expressed in cm; VC, C, M, F, VF, TS, silt, clay,
OC, and BS are expressed as %; Ca, Mg, Na, K, TB, EXT,
and CEC are expressed as cmol/kg.

n = 991

The large variation in particle size (very coarse,
coarse, medium, fine, and very fine sand fractions; silt;
and clay contents) was influenced by the diversity of
Paleudults (Appendix A) and the presence of horizons with
quite different textures. Paleudults had variable
thickness of coarse-textured horizons (Typic, Arenic, and
Grossarenic Subgroups) overlying fine-textured argillic
horizons.

Most of the soil properties studied did not have
skewness and/or kurtosis close to zero. The exception was
fine sand. Also, the histogram and normal probability plot
(Figure 6) indicated that fine sand values were normally
distributed, but when the Kolmogorov test was performed, it
indicated that fine sand had a large probability of being
non-normal. The significance probability (PROB>D) of the
Kolmogorov D statistic (D:Normal) was smaller than
α = 0.15. So, the null hypothesis was rejected for fine
sand.

Results of the Kolmogorov test indicated that the soil
properties studied had a non-normal distribution. Results
of the Kolmogorov test were also supported by the
histograms and normal probability plots. Histograms
revealed that the distribution of values by soil property
did not have the characteristic bell-shaped curve of a
normal distribution. In addition, normal probability plots
indicated a lack of correspondence between the observed and
the theoretical distributions, for example for organic
carbon content (Figure 7).

Figure 6. Histogram (a) and normal probability plot (b)
of fine sand content.

Transformations (logarithmic, arcsine, or square root)
were not made on the original data because the objective
was to accept or reject the normal distribution. In
addition, interpretation of transformed data is complex.

These results could indicate that there were
systematic patterns of soil properties; observations were
not independent but associated within a certain distance.
Patterns of soil properties influenced the probability
distribution.

The presence of trends in soil properties associated
with landscape position has been recognized. Walker et al.
(1968) pointed out that such trends suggested that the
analysis of soil data in terms of mean and standard
deviation is questionable, since the assumption of random
variation does not appear valid.

In addition, Hole and Campbell (1985) indicated that
if place-to-place variation occurred at random, without
elements of organization and order, mapping efforts could
proceed only with the greatest difficulty because
information and experience gained at one location would
have little predictive value at new locations. Under such
circumstances each mapping problem would be unique because
of the lack of a consistent geographic order that can be
transferred from previous experience to analogous settings.

Figure 7. Histogram (a) and normal probability plot (b)
of organic carbon content.

Principal  Component  Analysis 

Twenty soil properties were initially selected to study
the soil spatial variability using geostatistics.
Geostatistical analysis is time consuming and complex.
Moreover, not all soil properties have the same degree of
importance for quantifying the spatial variability of
soils. Therefore, a reduction in the number of soil
properties was necessary for further analysis.

PCA  was  used  as  an  unbiased  method  to  select  the  most 
important  soil  properties.     Important  soil  properties  were 
defined  as  those  that  explained  a  large  proportion  of  the 
total  variance. 

Two sets of data were employed for this analysis. One
set was composed of the weighted averages of selected soil
properties in individual pedons. Horizon thickness was
used as the weighting criterion. Information is lost when
averages are used. Therefore, a second set of data,
composed of selected soil properties from the surface A
horizon, was used.
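
The thickness-weighted average for one pedon can be written out as a short Python sketch. The horizon values are hypothetical and the function name is illustrative; only the weighting rule (horizon thickness as weight) comes from the text.

```python
def weighted_average(values, thicknesses):
    """Average of a soil property over horizons, weighted by thickness."""
    total_thickness = sum(thicknesses)
    return sum(v * t for v, t in zip(values, thicknesses)) / total_thickness

# Hypothetical pedon with four horizons.
clay = [5.0, 8.0, 25.0, 30.0]          # clay content of each horizon, %
thickness = [15.0, 25.0, 40.0, 50.0]   # horizon thickness, cm

pedon_clay = weighted_average(clay, thickness)
```

In this example the thick, clay-rich lower horizons pull the pedon value to about 21%, well above the 5% of the surface horizon, which illustrates why a separate A horizon data set was also analyzed.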

Principal  Component  Analysis  for  Standardized  Weighted  Data 

A basic assumption of PCA is that variables have
homogeneous variances (Afifi and Clark, 1984; Webster,
1977). The soil properties studied had different scales of
measurement (thickness was measured in cm; particle size,
organic carbon content, and base saturation in %; and
extractable cations, total bases, extractable acidity, and
CEC in cmol/kg). Therefore, it is difficult to compare
them. For this reason, all soil properties were
standardized to mean zero and variance one.

One measure of the amount of information conveyed by
each PC is its variance (eigenvalue). For this reason,
the PCs are commonly arranged in order of decreasing
variance (Table 3). The most informative PC is the first
and the least informative is the last.

The  criterion  for  selecting  PCs  was  stated  in  the 
Materials  and  Methods  section.     The  first  five  PCs  were 
selected  for  further  analysis.     Each  of  them  explained  more 
than  5%  of  the  total  variance  (Table  3).     The  first  five 
PCs  together  explained  more  than  73%  of  the  total  variance. 

Different interpretative analyses were performed to
select the soil properties that contributed the most to the
total variance. Plots of soil properties in the plane of
the PCs (Figure 8) were a very informative display of the
relationships between soil properties and PCs. The most
important soil properties were those with large values
located closer to the axis of the PC.

Some properties did not have a clear contribution to
an individual PC, such as coarse and medium sand fractions
and Mg content (Figure 8). The axes of PCs were rotated
toward clusters of those soil properties with no clear
contribution to an individual PC.

Table 3. Proportion of total variance explained by each
principal component.

Principal     Eigenvalue     Proportion     Cumulative
Component                      (%) *        Proportion

    1           5.9119         29.56           29.56
    2           3.0450         15.23           44.79
    3           2.5310         12.65           57.44
    4           1.9153          9.57           67.01
    5           1.2385          6.19           73.20
    6           0.8040          4.02           77.22
    7           0.7824          3.91           81.13
    8           0.6933          3.47           84.60
    9           0.6382          3.19           87.79
   10           0.5872          2.94           90.73
   11           0.4871          2.44           93.17
   12           0.4377          2.19           95.36
   13           0.3458          1.73           97.09
   14           0.2393          1.20           98.29
   15           0.2020          1.01           99.30
   16           0.1209          0.60           99.90
   17           0.0168          0.08           99.98
   18           0.0037          0.02          100.00
   19           0.0002          0.00          100.00
   20           0.0000          0.00          100.00

* Proportion of the total variance.

An orthogonal rotation (varimax rotation) was
employed (Figure 9). For this specific example, varimax
rotation showed that those soil properties with initially
no clear contribution to an individual PC were closer to
the axis of principal component 1 (PC1). This analysis
was complemented with a quantitative selection of
eigenvectors (coefficients of the linear combination of
soil variables).

Eigenvectors were calculated for each PC (Table 4).
The criterion for selecting important eigenvectors was also
stated in the Materials and Methods section. Selected
eigenvectors for PC1 had an absolute value larger than the
selection criterion value (Sc) of 0.2056. Soil properties
selected as important constituents of PC1, based on the
Sc value, were medium and total sand contents, clay
content, Ca, Mg, Na, and K contents, total bases,
extractable acidity, and CEC. Eigenvectors selected for
principal component 2 (PC2), principal component 3 (PC3),
principal component 4 (PC4), and principal component 5
(PC5) had absolute values larger than 0.2865, 0.3143,
0.3613, and 0.4493, respectively.

Each PC is defined as a linear combination of the
standardized variables, but collinearity among variables
may be a problem. An analysis of collinearity was
performed for those soil properties previously selected.
Soil properties with a Tolerance (T) < 0.01 were considered
to be highly intercorrelated and were also excluded (Table
5). PC5 was not included in the analysis of collinearity
because all eigenvectors had an absolute value smaller than
Sc.

Table 4. Eigenvectors of correlation matrix for
standardized weighted soil properties.

Soil *                     Principal Component
Property         1          2          3          4          5

TH             0.0282     0.0099    -0.1356    -0.4341    -0.1324
VC            -0.1043     0.0572    -0.3695     0.2786    -0.2616
C             -0.1856     0.1759    -0.4635     0.1570    -0.0538
M             -0.2394     0.1985    -0.3233     0.0900     0.2860
F             -0.1694     0.0157     0.4836    -0.0451     0.2886
VF             0.0121    -0.1961     0.4017     0.1550    -0.3878
TS            -0.3437     0.1251     0.1638     0.1191     0.2749
Silt           0.1623    -0.1796    -0.0478     0.4111    -0.3917
Clay           0.3109    -0.0604    -0.1687    -0.3341    -0.1507
PH1           -0.1816     0.3867     0.1860    -0.0321    -0.0877
PH2           -0.1018     0.4156     0.1352     0.1275    -0.1812
OC             0.1027    -0.0814     0.0655     0.5700     0.1678
Ca             0.2654     0.3372     0.0558     0.0649    -0.0258
Mg             0.2552     0.2061    -0.0143    -0.0635     0.1237
Na             0.2590     0.0205     0.0103     0.0525     0.2423
K              0.2577     0.1255     0.0119     0.0861     0.1200
TB             0.3007     0.3353     0.0407     0.0343     0.0267
EXT            0.3131    -0.2098    -0.0584     0.0920     0.2795
CEC            0.3755    -0.0274    -0.0307     0.0981     0.2194
BS             0.0829     0.4214     0.0785     0.0198    -0.2434

Sc **          0.2056     0.2865     0.3143     0.3613     0.4493

*  See Abbreviations, pp. xii-xiii.

** Sc = 0.5 / (Principal Component eigenvalue)^{1/2}

Values with an absolute value larger than the corresponding
Sc (underlined in the original table) were selected for
further study.

According to this criterion Ca, Mg, Na, and K
contents, total bases, extractable acidity, and CEC were
highly intercorrelated for PC1. Similar reduction of
variables was applied to other PCs.

A final reduction was made by calculating correlation
coefficients between soil properties and PCs (Table 6). A
large correlation coefficient (> |0.75|) was selected as
the criterion to reduce even more the number of soil
properties. Based on the correlation coefficient, fine
sand, total sand, clay, and organic carbon contents were
selected. Other soil properties also had a large
correlation coefficient, but they were previously
eliminated because of small eigenvectors or low tolerance.

In summary, fine sand, total sand, clay, and organic
carbon contents were selected for further analysis. The
selection was based on analyses of PC plots, plots of
rotated PC axes, quantitative selection of larger
eigenvectors, collinearity tests, and computation of
correlation coefficients between soil properties and PCs.
The selected soil properties were those that explained most
of the variance of the total set of data.

Table 5. Tolerance of standardized weighted soil
properties by principal component.

                      Principal Component
       1               2               3               4
   *       T       *       T       *       T       *       T

M        0.69    PH1     0.61    VC      0.45    TH      0.90
TS       0.11    PH2     0.54    C       0.22    Silt    0.73
Clay     0.14    Ca      0.08    M       0.36    OC      0.73
Ca       <.01    TB      0.08    F       0.78
Mg       <.01    BS      [illegible]     VF      [illegible]
Na       <.01
K        <.01
TB       <.01
EXT      <.01
CEC      <.01

* See Abbreviations, pp. xii-xiii.

T = 1 - R^2 (R = coefficient of multiple correlation)

Table 6. Correlation coefficients between standardized
weighted soil properties and principal components.

Soil *                     Principal Component
Property         1          2          3          4          5

TH             0.0686     0.0173    -0.2157    -0.6008    -0.1473
VC            -0.2538     0.0998    -0.5878     0.3856    -0.2911
C             -0.4513     0.3069    -0.7374     0.2173    -0.0599
M             -0.5821     0.3464    -0.5143     0.1246     0.3183
F             -0.4119     0.0274     0.7694    -0.0624     0.3212
VF             0.0294    -0.3422     0.6391     0.2145    -0.4315
TS            -0.8357     0.2183     0.2606     0.1648     0.3058
Silt           0.3946    -0.3134    -0.0757     0.5689    -0.4359
Clay           0.7559    -0.1054    -0.2684    -0.4624    -0.1677
PH1           -0.0443     0.6748     0.2959    -0.0444    -0.0976
PH2            0.2475     0.7252     0.2151     0.1765    -0.2017
OC             0.2497    -0.1420     0.1042     0.7889     0.1867
Ca             0.6453     0.5884     0.0888     0.0898    -0.0287
Mg             0.6205     0.3596    -0.0227    -0.0879     0.1377
Na             0.6297     0.0358     0.0164     0.0727     0.2697
K              0.6266     0.2188     0.0191     0.1192     0.1335
TB             0.7311     0.5851     0.0647     0.0475     0.0297
EXT            0.7613    -0.3661    -0.0929     0.1275     0.3109
CEC            0.9130    -0.0478    -0.0488     0.1358     0.2442
BS             0.2016     0.7353     0.1249     0.0274    -0.2708

* See Abbreviations, pp. xii-xiii.

Values underlined in the original table had an absolute
value > 0.75.

Principal  Component  Analysis  for  A  Horizon  Standardized 
Data 

The  first  five  PCs  explained  approximately  74%  of  the 
total  variance  for  the  A  horizon  (Table  7).  Similar 
analyses  of  plots  as  indicated  earlier  for  standardized 
weighted  average  values  were  used. 

Eigenvector coefficients with absolute values larger than
the Sc values of 0.2126, 0.2553, 0.3070, 0.3857, and 0.4605
for PC1, PC2, PC3, PC4, and PC5, respectively, were selected
(Table 8).
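
The quantities used in this step can be reproduced from the
eigenvalues in Table 7. The sketch below assumes that each
proportion is the eigenvalue divided by the number of
standardized variables (taken as 20, the number of components
listed in Table 7) and that Sc = 0.5 / sqrt(eigenvalue). The
second assumption reproduces the tabulated thresholds for PC1
to PC4; it gives about 0.48 for PC5, whereas the tabulated
value is 0.4605, and the source does not explain the
difference.

```python
import math

# Eigenvalues of PC1-PC5 for the standardized A horizon data (Table 7).
eigenvalues = [5.5308, 3.8348, 2.6524, 1.6807, 1.0856]
n_variables = 20  # assumption: one component per standardized variable

# Proportion of the total variance explained by each PC, in percent.
proportion = [100.0 * ev / n_variables for ev in eigenvalues]

# Cumulative proportion, in percent.
cumulative = []
running = 0.0
for p in proportion:
    running += p
    cumulative.append(running)

# Selection threshold for eigenvector coefficients (assumed formula).
sc = [0.5 / math.sqrt(ev) for ev in eigenvalues]

for k, (p, c, s) in enumerate(zip(proportion, cumulative, sc), start=1):
    print(f"PC{k}: {p:6.2f}%  cumulative {c:6.2f}%  Sc {s:.4f}")
```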

Analysis of collinearity showed that only total sand had a
T value < 0.01 (Table 9); therefore, A horizon total sand
was eliminated from further analysis. Silt and clay also had
low T values, indicating some correlation among those
properties. After the correlation coefficients between soil
properties and PCs were computed (Table 10), clay content
and CEC were selected. Both had a correlation coefficient
with an absolute value larger than 0.75.
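
The collinearity test can be illustrated with a short sketch
of the tolerance statistic, T = 1 - R^2, where R^2 comes from
regressing one property on the others in the same group. The
function name tolerance and the synthetic data are
assumptions made for illustration; the example only shows why
a property that is the sum of other properties (such as total
sand and its fractions) has a tolerance near zero.

```python
import numpy as np

def tolerance(x):
    """Tolerance T = 1 - R^2 of each column regressed on all other columns."""
    n, p = x.shape
    out = np.empty(p)
    for j in range(p):
        y = x[:, j]
        others = np.delete(x, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # intercept + predictors
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        r2 = 1.0 - resid.var() / y.var()
        out[j] = 1.0 - r2
    return out

# Synthetic data (hypothetical values, in percent).
rng = np.random.default_rng(1)
coarse = rng.normal(20.0, 5.0, 151)
fine = rng.normal(40.0, 8.0, 151)
total_sand = coarse + fine           # exactly determined by the two fractions
organic_c = rng.normal(1.0, 0.3, 151)  # unrelated to the sand fractions

t = tolerance(np.column_stack([coarse, fine, total_sand, organic_c]))
print(np.round(t, 3))
```

A property with T below 0.01 carries almost no information
beyond what the other properties already provide, which is the
reason given for removing it.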

Although organic carbon content in the A horizon is an
important property, it was not selected by the PCA.
Therefore, it may be concluded that organic carbon content
was not as important as clay content and CEC in explaining
the total variance.

Table 7. Proportion of total variance explained by each
principal component for standardized A horizon
data.

Principal     Eigenvalue    Proportion*    Cumulative
Component                       (%)        Proportion (%)

    1           5.5308         27.65           27.65
    2           3.8348         19.17           46.82
    3           2.6524         13.26           60.08
    4           1.6807          8.40           68.48
    5           1.0856          5.43           73.91
    6           0.9913          4.96           78.87
    7           0.9273          4.64           83.51
    8           0.7856          3.93           87.44
    9           0.6752          3.08           90.52
   10           0.3676          1.84           92.36
   11           0.3144          1.57           93.93
   12           0.2913          1.46           95.39
   13           0.2078          1.04           96.43
   14           0.1913          0.96           97.39
   15           0.1796          0.90           98.29
   16           0.1335          0.72           99.01
   17           0.0783          0.62           99.63
   18           0.0653          0.33           99.96
   19           0.0073          0.04          100.00
   20           0.0000          0.00          100.00

* Proportion of the total variance.

Table 8. Eigenvectors of correlation matrix for
standardized properties of A horizon.

Soil                      Principal Component
Property*          1         2         3         4         5

TH           -0.0509    0.0935   -0.0047   -0.2287    0.0146
VC           -0.1167    0.2425   -0.3346    0.0244   -0.1879
C            -0.1566    0.2945   -0.3872    0.0516   -0.0386
M            -0.2206    0.2533   -0.2541    0.1803    0.2296
F            -0.1573   -0.2129    0.4110    0.2365    0.0394
VF            0.0461   -0.2672    0.2912   -0.1316   -0.5020
TS           -0.3654    0.0104    0.1294    0.3020   -0.0999
Silt          0.3093   -0.1332   -0.1973   -0.2671   -0.0016
Clay          0.3280    0.1182   -0.0222   -0.2658    0.1782
PH1        [values illegible]
PH2          -0.0958    0.3251    0.2977   -0.1084    0.1614
OC            0.2990   -0.1663   -0.0832    0.0868    0.1278
Ca            0.2397    0.2909    0.2653   -0.0275    0.1489
Mg            0.2157    0.2739    0.2156   -0.1050    0.0385
Na           -0.0251   -0.1989    0.0893    0.2417    0.6454
K             0.2722    0.0417    0.0078    0.4142    0.0353
TB            0.1644    0.2427    0.0529    0.4085   -0.3198
EXT           0.2051    0.3312    0.1259    0.0017   -0.0302
CEC           0.3238   -0.1661   -0.1199    0.2168    0.0189
BS            0.2971    0.0946   -0.0441    0.3565   -0.1662

Sc**          0.2126    0.2553    0.3070    0.3857    0.4605

* See Abbreviations, pp. xii-xiii.
** Sc = 0.5 / (principal component eigenvalue)^(1/2)
All underlined values had an absolute value larger than
the corresponding Sc. Underlined values were selected
for further study.

Table 9. Tolerance of standardized properties of
A horizon by principal component.

                                 Principal Component
        1                 2                 3                 4                 5
Property*    T    Property*    T    Property*    T    Property*    T    Property*    T

M   [illegible]   C          0.61   VC         0.27   K          0.64   VF         0.99
TS        <0.01   VF         0.67   C          0.24   TB         0.64   Na         0.99
Silt       0.09   PH1        0.38   F          0.73
Clay       0.03   PH2        0.36   PH1        0.96
OC         0.23   Ca         0.23
Ca         0.28   Mg         0.39
Mg         0.35   EXT        0.36
K          0.56
CEC        0.14
BS         0.32

* See Abbreviations, pp. xii-xiii.
T = 1 - R^2 (R = coefficient of multiple correlation).

Table 10. Correlation coefficients between standardized
properties of A horizon and principal components.

Soil                      Principal Component
Property*          1         2         3         4         5

TH           -0.1197    0.1831   -0.0077   -0.2964    0.0152
VC           -0.2744    0.4748   -0.5450    0.0316   -0.1957
C            -0.3683    0.5768   -0.6306    0.0669   -0.0403
M            -0.5189    0.4961   -0.4138    0.2337    0.2393
F            -0.3699   -0.4169    0.6693    0.3066    0.0411
VF            0.1084   -0.5233    0.4743   -0.1706   -0.5230
TS           -0.8593    0.0203    0.2107    0.3915   -0.1042
Silt          0.7274   -0.2609   -0.3213   -0.3462   -0.0017
Clay          0.7714    0.2315   -0.0362   -0.3446    0.1857
PH1          -0.2025    0.5988    0.5358   -0.0751   -0.0010
PH2          -0.2254    0.6367    0.4849   -0.1406    0.1681
OC            0.7031   -0.3256   -0.1355    0.1126    0.1332
Ca            0.5639    0.5696    0.4321    0.0357    0.1551
Mg            0.5073    0.5364    0.3511   -0.1362    0.0402
Na           -0.0589   -0.3895    0.1454    0.3134    0.6724
K             0.6403    0.0817    0.0127    0.5371    0.0368
TB            0.3865    0.4753    0.0861    0.5296   -0.3332
EXT           0.4824    0.6487    0.2051    0.0023   -0.0315
CEC           0.7615    0.3253   -0.1954    0.2811    0.0197

* See Abbreviations, pp. xii-xiii.
All underlined values had an absolute value > 0.75.

Two kinds of A horizons were present (Ap and A1). The Ap
horizon is influenced by management conditions, and the A1
horizon is found in relatively natural conditions. Thus, the
PCA was employed separately on these two classes of A
horizons.

All steps previously described were followed in the PCA for
these two groups of A horizons. The final selection of soil
properties by correlation coefficients (Tables 11 and 12)
revealed that organic carbon content and extractable acidity
were two important properties of the A1 horizon for PC1
(these soil properties represented approximately 39% of the
total variance). The PCA revealed the importance of organic
carbon content and the natural acidic conditions reflected
by the extractable acidity values.

Base saturation was the most important property of the Ap
horizon for PC1 (Table 12). Base saturation represented
approximately 24% of the total variance. PCA revealed,
therefore, the influence of management conditions (liming)
on the Ap horizons.

The PC2 for the Ap horizon also indicated that organic
carbon content was an important property. For this reason,
organic carbon content was also selected. Other soil
properties were not selected because they were previously
excluded by the PCA. Organic carbon and clay contents were
selected as important soil properties for both sets of
data, weighted average and A horizon values.

Table 11. Correlation coefficients between standardized
properties of A1 horizon and principal
components.

Soil                      Principal Component
Property*          1         2         3         4         5

TH         [values illegible]
VC         [values illegible]
C          [values illegible]
M          [values illegible]
F          [values illegible]
VF            0.1727   -0.3194   -0.6310    0.2714   -0.5134
TS           -0.9064   -0.0054   -0.1940   -0.0038    0.1558
Silt          0.7851   -0.2490    0.1689   -0.0236   -0.3054
Clay          0.8310    0.2766    0.1819    0.0906    0.0348
PH1          -0.2737    0.6579   -0.2146    0.2636    0.0405
PH2          -0.3744    0.6891   -0.0656    0.3259    0.1152
OC            0.7648   -0.3696    0.0709   -0.0410    0.1346
Ca            0.6051    0.7386   -0.0351    0.0318   -0.0022
Mg            0.6332    0.5895   -0.0881   -0.1155    0.0607
Na            0.7245   -0.0781    0.0625   -0.0307    0.1225
K             0.8072    0.3281    0.0261   -0.1239    0.1096
TB            0.6458    0.7187   -0.0411    0.0035    0.0138
EXT           0.7792   -0.4500    0.1489    0.1139    0.1982
CEC           0.8902   -0.2024    0.1232    0.1042    0.1845
BS            0.0278    0.7952   -0.1907    0.0247   -0.2019

* See Abbreviations, pp. xii-xiii.
All underlined values had an absolute value > 0.75.

Table 12. Correlation coefficients between standardized
properties of Ap horizon and principal
components.

Soil                      Principal Component
Property*          1         2         3         4         5

TH            0.0867   -0.2165   -0.1594   -0.0704    0.8748
VC            0.2069   -0.6088   -0.1908    0.3882   -0.1432
C             0.2684   -0.6478   -0.3617    0.5061   -0.0446
M             0.0718   -0.6448   -0.2155    0.5817    0.0229
F          [values illegible]
VF           -0.2499    0.4575    0.3277   -0.5642   -0.1930
TS           -0.5737    0.0372    0.6245    0.4727    0.0997
Silt          0.2532    0.0964   -0.6603   -0.4205   -0.1907
Clay          0.6416    0.0231   -0.4217   -0.3842    0.0219
PH1           0.4305   -0.1677    0.6459    0.1684   -0.1371
PH2           0.5942   -0.1764    0.5754    0.0675   -0.1712
OC            0.2859    0.7581   -0.2388    0.3986   -0.0544
Ca            0.8269    0.3231    0.2853    0.1766    0.1710
Mg            0.8090    0.1242    0.2352    0.0532   -0.0411
Na           -0.1085    0.4697   -0.0495    0.5145   -0.2272
K             0.6178    0.2221   -0.1573   -0.1801    0.0943
TB            0.8668    0.3063    0.2655    0.1490    0.1304
EXT           0.0919    0.7915   -0.3943    0.3386    0.0087
CEC           0.2515    0.8271   -0.2509    0.3608    0.0581

* See Abbreviations, pp. xii-xiii.
All underlined values had an absolute value > 0.75.

Principal Component Analysis by Soil Series

This  analysis  was  included  to  determine  how  this 
technique  can  be  used  to  select  important  soil  properties 
by  soil  series,  to  evaluate  the  variability  of  similar 
soils,  and  to  evaluate  the  correct  placement  of  pedons 
within  the  soil  classification  system. 

Theoretically,  each  PC  may  explain  approximately  the 
same  proportion  of  the  total  variance  for  similar  soils. 
To  evaluate  this  assumption,  soil  series  with  the  largest 
number  of  observations  were  selected  and  analyzed.  These 
were  the  Albany,  Dothan,  and  Orangeburg  series. 

Results of this analysis are presented in Figures 10, 11,
and 12. The proportion of the total variance explained by
the first PC varied widely: from 35.7% to 71.5% for the
Albany series, from 30.3% to 66.3% for the Dothan series,
and from 39.3% to 82.8% for the Orangeburg series. The
difference between the minimum and maximum values averaged
38% for the three soils. The degree of importance of the
soil properties varied from one county to another for the
same soil series. For example, total sand content was an
important property in explaining the total variation of the
Albany series in Jackson and Leon Counties but was not
important in Santa Rosa County. Similar examples can be
observed with other soil properties between different
counties in each soil series selected.

Because both the proportion of the total variance explained
by the first PC and the degree of importance of each soil
property varied widely between counties in each soil series,
the three soil series were plotted in the plane of the first
two PCs (Figure 13) to visualize the relationship between
pedons in each soil series.
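
The average difference of 38% quoted for the three series
follows directly from the reported ranges. The short
calculation below only restates that arithmetic; the variable
names are illustrative.

```python
# Minimum and maximum proportion (%) of total variance explained by PC1,
# as reported for each soil series.
ranges = {
    "Albany": (35.7, 71.5),
    "Dothan": (30.3, 66.3),
    "Orangeburg": (39.3, 82.8),
}

# Spread (max - min) per series and its mean over the three series.
spread = {name: hi - lo for name, (lo, hi) in ranges.items()}
mean_spread = sum(spread.values()) / len(spread)

print(spread)
print(round(mean_spread, 1))
```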

A  large  degree  of  dispersion  was  observed  in  each  soil 
series.     A  clear  grouping  of  pedons  by  individual  soil 
series did not exist. Thus, a nested analysis of variance
was used to create a clearer understanding of the
variation among pedons within each series.

Soil  series,  pedons  within  each  series,  and  horizons 
within  each  pedon  were  considered  as  sources  of  soil 
variation (Table 13). Theoretically, a larger variation
may occur between soil series (e.g., between Albany and
Dothan) and between horizons within pedons belonging to the
same series (e.g., between A and B horizons in the Albany
series).

A  large  part  of  the  total  variation  was  explained  by 
differences  between  pedons  belonging  to  the  same  series. 
More  than  30%  of  the  variability  in  all  sand  fractions 
(except  total  sand),  silt,  pH-water,  K  content,  and  CEC  was 
explained  by  the  differences  between  pedons  within  the  same 
soil  series. 
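
The partition of variability among nested sources can be
sketched with a simplified, balanced design. This is an
illustration under stated assumptions, not the analysis used
in the study: it has only three sources (series, pedons within
series, and a residual that stands for horizons within
pedons), equal numbers of pedons and observations per cell,
synthetic data, and method-of-moments estimates with negative
estimates set to zero. The function name nested_components is
hypothetical.

```python
import numpy as np

def nested_components(y):
    """Percent of variance by source for a balanced nested design.

    y has shape (series, pedons per series, observations per pedon).
    """
    a, b, n = y.shape
    grand = y.mean()
    series_mean = y.mean(axis=(1, 2))
    pedon_mean = y.mean(axis=2)

    # Mean squares of the nested analysis of variance.
    ms_series = b * n * ((series_mean - grand) ** 2).sum() / (a - 1)
    ms_pedon = n * ((pedon_mean - series_mean[:, None]) ** 2).sum() / (a * (b - 1))
    ms_resid = ((y - pedon_mean[:, :, None]) ** 2).sum() / (a * b * (n - 1))

    # Method-of-moments variance components; negative estimates become zero.
    var_resid = ms_resid
    var_pedon = max(0.0, (ms_pedon - ms_resid) / n)
    var_series = max(0.0, (ms_series - ms_pedon) / (b * n))

    total = var_series + var_pedon + var_resid
    return {
        "series": 100.0 * var_series / total,
        "pedon": 100.0 * var_pedon / total,
        "residual": 100.0 * var_resid / total,
    }

# Synthetic property dominated by differences between three series.
rng = np.random.default_rng(2)
series_effect = np.array([-10.0, 0.0, 10.0])[:, None, None]
pedon_effect = rng.normal(0.0, 1.0, size=(3, 10, 1))
resid = rng.normal(0.0, 1.0, size=(3, 10, 4))
y = 50.0 + series_effect + pedon_effect + resid

pct = nested_components(y)
print({k: round(v, 1) for k, v in pct.items()})
```

With this kind of estimator a source can receive exactly 0%
when its mean square does not exceed the one below it in the
hierarchy.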


[Figure 13. Soil series plotted in the plane of the first two
principal components; vertical axis: Principal Component 2.]

Table 13. Variability of studied soil properties within
and between soil series and between horizons.

Soil                     Source of Variation (%)
Property*    Soil Series    Pedon**    Horizon    Error

TH           [values illegible]
VC           [values illegible]
C            [values illegible]
M            [values illegible]
F            [values illegible]
VF           [values illegible]
TS               51.8          0.0       34.0      14.2
Silt              7.0         34.6       21.5      36.9
Clay             36.7          0.0       43.1      20.2
PH1               7.2         36.9        4.5      51.4
PH2               0.0         26.0        9.2      64.8
OC                1.4          0.0       95.0       3.6
Ca               12.7          0.0       77.5       9.7
Mg               26.7          0.0       61.5      11.8
Na                7.0          0.0       88.8       4.2
K                 0.0         30.2       25.6      44.2
TB                5.7         23.9       27.9      42.4
EXT               0.0          0.0       81.0      19.0
CEC               0.0         37.3       48.5      14.2
BS                9.8         22.4       23.4      44.3

* See Abbreviations, pp. xii-xiii.
** Pedon within soil series.

A large part of the variability in total sand and clay
contents  was  explained  by  the  differences  between  soil 
series.     More  than  40%  of  the  variability  in  clay,  organic 
carbon,  Ca,  Mg,  and  Na  contents;  extractable  acidity;  and 
CEC  was  explained  by  the  difference  between  horizons.  Some 
soil  properties  (horizon  thickness,  very  coarse  sand  and 
silt  contents,  pH-KCl,  K  content,  total  bases,  and  base 
saturation)  had  a  large  unexplained  variability  (error). 

Total  sand  and  clay  contents  fulfilled  the  initial 
hypothesis  which  stated  that  a  large  part  of  the 
variability  was  explained  by  differences  among  soil  series. 
Organic  carbon  content  also  fulfilled  the  initial 
hypothesis  that  a  large  part  of  the  variability  was 
explained  by  differences  among  soil  horizons  within  similar 
soil  series. 

These  results  validated  the  conclusions  of  the  PCA, 
for  both  standardized  weighted  data  and  standardized  A 
horizon  data.     Total  sand,  clay,  and  organic  carbon 
contents  were  selected  by  the  PCA  as  soil  properties  which 
were  important  in  explaining  the  total  variance.     Fine  sand 
content  and  CEC  were  also  selected  by  PCA,  but  according  to 
the  nested  analysis  of  variance,  a  large  part  of  their 
variability  was  explained  by  the  differences  among  pedons 
within  soil  series.     Therefore,  fine  sand  and  CEC  were  not 
included in the geostatistical analysis.

In addition, these results also validated the use of a
large correlation coefficient (absolute value > 0.75)
because this coefficient allowed the selection of those
variables with large variability between soil series and
horizons.

Both  PCA  and  nested  analysis  of  variance  were  very 
useful  in  selecting  important  soil  properties  (total  sand, 
clay,  and  organic  carbon  contents)  for  further  analysis. 
PCA  reduced  the  large  number  of  soil  properties  selected 
initially.     The  nested  analysis  of  variance  demonstrated 
that  most  of  the  soil  properties  selected  by  the  PCA  were 
important  as  differentiating  properties  between  soil  series 
and/or horizons. Likewise, the selected soil properties
are important for determining specific soil potentials
(e.g., fertility and irrigation). Thus, the variability of
the selected soil properties affects the accuracy of the
predictions for these specific performances.

For a final validation, soils were plotted in the plane of
the first two PCs, considering only the important selected
soil properties (Figure 14). In this plot, a slightly better
grouping of soils by series was observed than in
Figure 13.

An important conclusion from these analyses is that,
because of the multivariate character of soils, the
selection of variables must be based on some quantitative
method. Otherwise, a biased selection of variables can
introduce a large source of error in the results.
Conversely, the use of the complete set of data would add
more complexity to the analysis.

A  large  number  of  soil  properties  had  a  large 
proportion  of  the  variability  either  explained  by 
differences  among  pedons  belonging  to  the  same  soil  series 
and/or  unexplained  soil  variability.     It  is  believed  that 
the  possible  causes  of  the  variability  are: 

(i) Soil properties relevant to defining series, such as
morphological properties, were not considered in the
analyses.

Variability  of  total  sand,  clay,  and  organic  carbon 
contents  was  successfully  explained  by  differences  between 
soil  series  and/or  horizons  because  these  soil  properties 
are  related  to  morphological  properties  of  a  given  horizon. 
Total  sand  content  is  related  to  the  coarse-textured 
surface  horizon,  clay  content  is  related  to  the  argillic 
horizon,  and  organic  carbon  content  is  related  to  the 
surface  A  horizon. 

(ii) Sampling errors caused by assuming an erroneous concept
of soil variability. Sampling errors are introduced if
soil scientists assume that the sampling unit is completely
uniform when it is not.

It seems very difficult to have a completely uniform
sampling unit. Variability has been recognized at all
scales. Soil variability has been widely recognized at the
macroscopic scale (Beckett and Webster, 1971; Beckett and
Bie, 1976). In addition, variability can be recognized
microscopically and submicroscopically (Wilding and Drees,
1978, 1983). If soil variability is considered as the sum
of variability at all scales, then it is very difficult to
have uniform soils. In reality, "uniform" soils are those
in which the internal variability ("within" variability) is
lower than the variability relative to the surrounding
soils ("between" variability).

(iii)  A  large  source  of  variation  was  introduced 
because  of  lack  of  emphasis  by  soil  scientists  on  soil  and 
landscape  relationships. 

Descriptions  of  the  geomorphic  environment  are  very 
ambiguous  for  some  soil  series.     For  example,  the 
geographic  setting  of  Orangeburg  series  is  described  as 
follows:  Orangeburg  soils  are  on  nearly  level  to  strongly 
sloping  uplands  of  the  Coastal  Plain.     Slopes  range  from  0 
to  20%  (National  Cooperative  Soil  Survey,  1982).  The 
geomorphic  environment  was  described  as  gently  sloping 
uplands  with  4%  gradient  for  an  individual  pedon  of 
Orangeburg  series  (Carlisle  et  al.,  1985;  p.  192). 

Many  soil  investigations  in  the  U.S.  have  involved 
geomorphic  surfaces.     Ruhe  (1969)  defined  a  geomorphic 
surface  as  a  portion  of  the  landscape  specifically  defined 
in  space  and  time.     The  surface  is  a  mappable  unit  that  has 
no  size  limit  and  may  include  a  number  of  landforms  and 
landscapes. According to this concept, only time is uniform
for a geomorphic surface; other geomorphic features related
to space (i.e., physiography) can have large variations.

In  addition,  a  low  degree  of  accuracy  in  the  pedon 
location  descriptions  was  evident  while  locating  the 
selected pedons on topographic maps. More emphasis has
been placed on the descriptive aspect of the soil series
than on the geographical aspect.

Importance  of  the  geographical  aspect  of  soil  was 
pointed  out  by  Bie  (1984).     He  indicated  that  the  accurate 
location  of  pedons  by  X  and  Y  coordinates  would  be  a  great 
contribution  to  soil  science. 

(iv) Possible errors in soil correlation. The large
degree of pedon dispersion within individual soil series
may be the result of incorrect placement of individual
pedons in the soil classification system. Soil correlation
was beyond the scope of this investigation, but PCA may be
a useful quantitative method for indicating problems in
soil correlation.

Geostatistics

The variability of soil properties is a limiting factor for
reliable soil interpretations and for making accurate
predictions of soil performance at any particular location
on the landscape.

A large number of studies have been made to quantify soil
variability, but they have not taken into consideration the
geographic character of soil variability. Conversely,
geostatistical analysis is based on the geographic location
of the individual observations. Therefore, geostatistical
techniques can offer a solution to some of the unsolved
problems of spatial variability of soils.

The 151 pedons studied were located by a system of X and Y
coordinates (Appendix C). Pedons were irregularly
distributed over an area of approximately 380 x 100 km
(Figure 15).

Geostatistical  analysis  can  be  used  for  horizontal  and 
vertical  directions,  but  using  both  these  directions  adds 
more  complexity  to  the  analysis.     Therefore,  the  data  were 
selected  to  represent  the  variability  of  soils  in  the 
horizontal  plane. 

Two sets of data were analyzed. One set was composed of
weighted average values of total sand, clay, and organic
carbon contents. The other set of data included clay and
organic carbon contents from the A horizon.

Semi-Variograms

The first step in the geostatistical analysis was to
calculate the semi-variance. The number of pedons provided
sufficient pairs of observations for reliable estimates of
semi-variograms. The total number of pairs was calculated
from the combinatorial equation:

No. of combinations = n! / [r! (n - r)!]          (48)

where n = total number of pedons
      r = number of pedons taken at one time

When r = 2, equation (48) reduces to

No. of pairs = n (n - 1) / 2                      (49)

According to equation (49), 151 pedons provided 11,325
pairs. Direction-independent and direction-dependent
semi-variances were calculated for each soil property
studied. Semi-variances for the direction-independent case
and for the E-W, NE-SW, NW-SE, and N-S directions were
supported by 11,325; 8,298; 924; 1,450; and 653 pairs of
observations, respectively.
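
The pair counts quoted above can be checked with a few lines
of arithmetic. The sketch uses the standard-library function
math.comb; the dictionary name directional is illustrative.

```python
from math import comb

n_pedons = 151

# Number of distinct pairs of pedons: n (n - 1) / 2.
pairs = comb(n_pedons, 2)

# Pairs reported for each direction class.
directional = {"E-W": 8298, "NE-SW": 924, "NW-SE": 1450, "N-S": 653}

print(pairs)
print(sum(directional.values()))
```

The four direction classes together account for every pair, so
each pair was assigned to exactly one direction.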

A  reliable  semi-variogram  is  obtained  when  intervals 
are  chosen  such  that  the  number  of  pairs  is  large  enough  to 
ensure  accurate  definition  of  each  point  on  the  semi- 
variogram.     A  rule  of  thumb  is  to  use  intervals  such  that 
the  minimum  number  of  pairs  of  observations  in  each 
interval  is  about  50  (Skrivan  and  Karlinger,  1979). 
Likewise, the maximum lag distance that provides reliable
semi-variograms is half of the total length (Journel and
Huijbregts, 1978). The total length was approximately 380
km in the E-W direction and 100 km in the N-S direction. A
lag distance of 10 km was selected to calculate the
semi-variance up to 190 km in the E-W direction and 50 km
in the N-S direction.
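
A direction-independent semi-variogram with these settings can
be sketched as below. This is an illustration on synthetic
coordinates and values, not the program used in the study; the
function name semivariogram and its parameters are
assumptions. Pairs are grouped in 10 km classes up to 190 km,
each class is represented by its midpoint, and classes with
fewer than 50 pairs are dropped, following the rule of thumb
cited above.

```python
import numpy as np

def semivariogram(coords, values, lag=10.0, max_dist=190.0, min_pairs=50):
    """Empirical semi-variance per distance class: (midpoint, gamma, pairs)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    sq = (values[:, None] - values[None, :]) ** 2

    # Keep each pair once (upper triangle, no self-pairs).
    iu = np.triu_indices(len(values), k=1)
    dist, sq = dist[iu], sq[iu]

    edges = np.arange(0.0, max_dist + lag, lag)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (dist >= lo) & (dist < hi)
        n = int(mask.sum())
        if n >= min_pairs:
            # Semi-variance: half the mean squared difference in the class.
            rows.append(((lo + hi) / 2.0, 0.5 * sq[mask].mean(), n))
    return rows

# Synthetic example: 151 points scattered over a 380 x 100 km area.
rng = np.random.default_rng(3)
coords = np.column_stack([rng.uniform(0, 380, 151), rng.uniform(0, 100, 151)])
values = rng.normal(70.0, 12.0, 151)  # hypothetical property, e.g. percent

rows = semivariogram(coords, values)
for h, gamma, n in rows[:5]:
    print(f"{h:6.1f} km  gamma = {gamma:8.2f}  pairs = {n}")
```

A direction-dependent version would additionally filter the
pairs by the angle of the separation vector before grouping
them by distance.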

Stationarity is an important assumption to consider in
geostatistics. The criterion used to determine the
validity of this assumption was explained by Journel and
Huijbregts (1978). They indicated that when the
semi-variance increases faster than |h|^2 (|h| = modulus of
the lag distance) for large distances h, the increase is
incompatible with the intrinsic hypothesis. Such an
increase in the semi-variance often indicates the presence
of drift.

Statisticians established the constraint of stationarity
because each sample was considered unique when
geostatistics was developed. Consequently, statistical
inference about the population could not be made.

Geostatistics was developed in the mining industry.
Sampling procedures in mining are quite different from
sampling procedures applied in soils. Sampling ore
deposits involves large volumes of individual samples,
long sampling times, and high costs. It is very difficult
to take sample replications in mining.

Sampling soils is a completely different situation. Most
soil samples are taken within 2 m of the soil surface. In
addition, soil samples can be taken at distances varying
from a few cm to several km apart at relatively low cost.

Soil  stationarity  was  assumed  before  geostatistics 
could  be  used  in  soil  science.     Soil  scientists  have 
assumed  stationarity  when  they  take  replications  to 
increase  the  precision  of  the  results.     Stationarity  of 
soils  is  assumed  when  the  placement  of  soils  in  the 
classification system is tested. Stationarity of soils has
also been implicitly assumed when a map unit is delineated
by a soil survey. Observations are not unique in soil
science.     It  is  possible  to  take  relatively  homogeneous 
replications  of  soil  samples. 

Stationarity is not as serious a problem in soils as it is
in mining. Therefore, the criterion to determine soil
stationarity needs to be defined. Stationarity is
important within the area in which a large degree of
similarity and dependence in soil property values exists.
The similarity and dependence of soil property values are
large within a map unit. The degree of dependence
decreases when soil properties are measured in different
map units, up to the point at which soil property values
are no longer related.

The within-unit (WU) variability and the between-unit (BU)
variability are important for knowing the degree of
uniformity of map units, of individual pedons, or of soil
properties. The WU variability is expected to be smaller
than the BU variability. However, where different levels of
management have been applied, WU variability may exceed the
BU variability (Beckett and Webster, 1971; McCormack and
Wilding, 1969). In general, the WU variability gives the
degree of uniformity or variability of a map unit or of
individual soil properties. The WU variance could then be
used as a criterion to establish stationarity for those
soil properties less affected by management (e.g., total
sand and clay contents). When twice the value of the
semi-variance (G) is larger than the WU variance
(2G = variance) in the area in which the soil properties
are supposed to be related, stationarity is absent.

Data  were  grouped  by  soil  series.     The  WU  variability 
was  represented  by  the  WU  variance  of  the  soil  series. 
Total  sand  and  clay  contents  had  a  WU  variance  of  33.8  and 
14.3  respectively.     The  first  semi-variogram  for  total  sand 
content  (Figure  16)  had  a  G  value  that  increased  from  198.4 
at  5  km  distance  to  249.1  at  15  km  distance.     The  increase 
in  distance  represented  an  increase  of  2  in  the  modulus  of 
the  lag  distance.     The  semi-variogram  for  clay  content  had 
a  G  value  that  increased  from  153.5  at  5  km  distance  to 
180.1 at 15 km distance (Figure 17). In both cases, twice
the value of G far exceeded the WU variance. Therefore,
semi-variograms of weighted average total sand and clay
contents had drift. If the information contained in the
semi-variogram is to be used for making optimal unbiased
estimates (kriging) of the selected soil properties at
unsampled locations, the drift must be removed. An
iterative procedure, explained in the Materials and Methods
section, was used to remove the drift.
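
The comparison between 2G and the WU variance can be written as a short check. The Python sketch below is illustrative only: the function name `exceeds_wu_variance` is hypothetical, and the inputs are the G values and WU variances quoted in the preceding paragraph. It does not reproduce the iterative drift-removal procedure of the Materials and Methods section.

```python
def exceeds_wu_variance(semi_variance, wu_variance):
    """Return True when twice the semi-variance (2G) exceeds the WU variance."""
    return 2.0 * semi_variance > wu_variance


# G at the 5 km and 15 km lags and WU variances, as quoted in the text.
cases = {
    "total sand": {"G": [198.4, 249.1], "wu_variance": 33.8},
    "clay": {"G": [153.5, 180.1], "wu_variance": 14.3},
}

# Stationarity is taken as absent when every listed lag exceeds the criterion.
drift = {
    name: all(exceeds_wu_variance(g, case["wu_variance"]) for g in case["G"])
    for name, case in cases.items()
}
print(drift)
```

With these inputs, both properties exceed the criterion at both lags, which agrees with the conclusion that drift was present.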

The  observed  drift  in  total  sand  and  clay  content 
semi-variograms  was  reduced,  but  it  was  not  completely 
removed. A reason for this may be the presence of
short-range variability in the soil properties. Pedons with
large differences in soil properties were located at short
distances from each other. Consequently, dissimilar pedons
were compared when semi-variograms were calculated with lag
increments of 10 km.
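
How pedon pairs are grouped into lag classes can be sketched as follows. This is a minimal example assuming the classical (Matheron) estimator, G(h) = sum of squared differences / (2 N(h)); the estimator actually used is the one defined in the Materials and Methods section. The function name, the coordinates, and the sand contents are made up for illustration.

```python
import numpy as np


def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Semi-variance per lag class; class k covers [k*lag_width, (k+1)*lag_width)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)  # every pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2
    lag_class = np.floor(dist / lag_width).astype(int)
    gamma = np.full(n_lags, np.nan)  # NaN marks classes without pairs
    counts = np.zeros(n_lags, dtype=int)
    for k in range(n_lags):
        in_class = lag_class == k
        counts[k] = in_class.sum()
        if counts[k] > 0:
            gamma[k] = sq_diff[in_class].sum() / (2.0 * counts[k])
    return gamma, counts


# Hypothetical transect: four pedons 10 km apart, made-up sand contents (%).
coords = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (30.0, 0.0)]
sand = [80.0, 70.0, 90.0, 60.0]
gamma, counts = empirical_semivariogram(coords, sand, lag_width=10.0, n_lags=4)
print(gamma, counts)
```

Because no two pedons in this example are closer than 10 km, the first lag class is empty. This is the situation described above: variation at distances shorter than the lag increment cannot be resolved and appears as unexplained variance.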

These results are supported by previous work.
Burrough (1983a) pointed out that it seems impossible to
achieve stationarity. Olea (1975) could not eliminate the
drift in the data; he then used universal kriging to
produce maps that indicated trends in data variability.

Total  sand  and  clay  content  semi-variograms  were 
characterized  by  the  presence  of  structure.  Semi-variogram 
structure  occurs  when  there  is  an  increase  of  the  semi- 
variance  to  a  maximum  value  (Figures  18,  19,  20,  and  21). 
These semi-variograms had characteristic nugget variances
(intercepts), ranges, and sills (Table 14).

Theoretically, the semi-variogram should pass through
the origin when the distance h = 0 (h is the lag distance).
However, total sand and clay contents had non-zero
semi-variances as h decreased to zero. This is called nugget
variance or nugget effect (Journel and Huijbregts, 1978).
Nugget variance represents the unexplained variance, often
caused by measurement error or variability of the soil
properties that could not be identified at the scale
employed. The intercept, which is the estimate of G at
h = 0, provided an indication of the variation at a
distance shorter than 10 km.

Table 14. Important semi-variogram parameters of the
weighted average of selected soil properties.

Semi-variogram             Range     Sill    Nugget       %*
                            (km)             variance

Total sand content
  Direction-independent     34.7    324.0     173.6     53.6
  East-West                 24.8    295.8     160.1     54.1
  Northeast-Southwest       34.7    292.4     182.7     62.5
  Northwest-Southeast       25.4    287.3     220.8     76.9
  North-South               30.0    352.5     120.5     34.2

Clay content
  Direction-independent     34.7    230.2     135.9     59.0
  East-West                 34.6    211.3     123.2     58.3
  Northeast-Southwest       34.7    206.5     143.3     69.4
  Northwest-Southeast       15.6    210.3     153.2     72.9
  North-South               34.7    299.5      99.7     33.3

Organic carbon content
  Direction-independent    <10.0    0.120     0.120    100.0
  East-West                <10.0    0.119     0.119    100.0
  Northeast-Southwest      <10.0    0.069     0.069    100.0
  Northwest-Southeast      <10.0    0.052     0.052    100.0
  North-South              <10.0    0.201     0.201    100.0

* % of the sill represented by the nugget variance

The range of the semi-variogram is the distance at
which G attains the maximum value (sill). The range can be
interpreted as the diameter of the zone of influence, which
represents the average maximum distance over which
observations are related (dependent). At distances larger
than the range, observations are no longer related
(independent).

At  distances  less  than  the  range,  measured  properties 
(e.g.,  total  sand  and  clay  contents)  of  two  samples  become 
more  alike  with  decreasing  distance  between  them.  Thus, 
the  range  provides  an  estimate  of  the  areas  of  similarity. 
The  range  also  represents  the  average  minimum  distance  at 
which  maximum  variation  occurs. 

The maximum semi-variance value is called the sill.
The sill is equal to the sum of the nugget variance and the
spatial covariance (C0 + C). Often, the sill is
approximately equal to the sample variance (Journel and
Huijbregts, 1978).
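
The relation sill = C0 + C and the percentage column of Table 14 can be illustrated with a few lines of Python. This is a sketch only: the function name `partition_sill` is hypothetical, and the inputs are three rows of Table 14 for weighted average total sand content.

```python
def partition_sill(sill, nugget):
    """Split the sill into its nugget (C0) and structured (C) parts.

    Returns (C, nugget variance as a percentage of the sill).
    """
    structured = sill - nugget
    nugget_pct = 100.0 * nugget / sill
    return structured, nugget_pct


# (sill, nugget variance) for weighted average total sand content, Table 14.
total_sand = {
    "Direction-independent": (324.0, 173.6),
    "North-South": (352.5, 120.5),
    "Northwest-Southeast": (287.3, 220.8),
}

for name, (sill, nugget) in total_sand.items():
    c, pct = partition_sill(sill, nugget)
    print(f"{name}: C = {c:.1f}, nugget = {pct:.1f}% of the sill")
```

The smaller the structured part C relative to the sill, the less of the variation is spatially dependent at the sampling scale used.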

Total sand and clay content semi-variograms were
anisotropic, indicating that the variability of selected
soil properties changed with direction (Table 14). The
longest range was approximately 35 km for both total
sand and clay contents. It occurred in the
direction-independent and NE-SW semi-variograms for total
sand content, and in the direction-independent, NE-SW, and
N-S semi-variograms for clay content. The largest variation
(sill) occurred in the N-S direction for both total sand
and clay contents. The largest proportion of the
unexplained variation occurred in the NW-SE direction for
both total sand and clay contents. Differences between
direction-dependent semi-variograms for the soil properties
selected by the PCA could be the result of differences in
geology and topography.
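
Direction-dependent semi-variograms require each pair of pedons to be assigned to a direction class before the semi-variance is computed. The sketch below shows one way to do this for the four directions used here. The angular tolerance of 22.5 degrees and the function name `direction_class` are assumptions for illustration; the tolerance used by the geostatistical program is not stated in this section.

```python
import math


def direction_class(dx, dy, tolerance=22.5):
    """Assign the vector between two pedons to one of four direction classes.

    dx is the eastward and dy the northward separation. Opposite directions
    are equivalent, so the angle is folded into [0, 180) degrees.
    """
    angle = math.degrees(math.atan2(dy, dx)) % 180.0
    centers = {"E-W": 0.0, "NE-SW": 45.0, "N-S": 90.0, "NW-SE": 135.0}
    for name, center in centers.items():
        # Angular distance on a circle with a period of 180 degrees.
        diff = abs(angle - center)
        diff = min(diff, 180.0 - diff)
        if diff <= tolerance:
            return name
    return None  # only reachable when tolerance is below 22.5 degrees


print(direction_class(10.0, 0.0))   # pair separated due east
print(direction_class(-3.0, 3.0))   # pair separated toward the northwest
```

Pairs in each class would then be passed separately to the semi-variance calculation, giving one semi-variogram per direction.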

Organic  carbon  content  is  a  soil  property  influenced 
by  management.     The  WU  variance  was  smaller  than  the  BU 
variance  (0.0  and  0.07  respectively).     Organic  carbon 
content  had  very  low  WU  and  BU  variances  because  the 
largest  variation  occurred  between  horizons  (Table  13). 

Stationarity in organic carbon values was present when
the WU variance was used to determine stationarity, but
stationarity was absent if the semi-variance increment
compared to the lag distance increment was used. This
situation resulted because semi-variance values for organic
carbon were very small (generally less than one);
therefore, any increment in lag distance always resulted in
the absence of drift. The absence of drift, when lag
distance was used as the criterion, was a problem related
to the measurement scale. This was supported by the fact
that organic carbon had the largest C.V. (Table 2) among
the three soil properties used for the geostatistical
analysis. Consequently, the WU variance was used as the
criterion to determine stationarity.

The semi-variogram for organic carbon content did not have
any structure (Figures 22 and 23); there was no increase
to a maximum value. A pure nugget effect was observed,
indicating short-range variability in organic carbon
content. Organic carbon content had a large point-to-point
variation at short distances of separation and an absence
of spatial correlation at the scale used. The range of the
organic carbon content semi-variogram was a distance
smaller than 10 km.

Direction-dependent semi-variograms (Figure 23) showed
an anisotropic variation in organic carbon content. The
largest variation occurred in the N-S direction (Table 14).
The anisotropic variation of organic carbon indicated that
the factors which influence the organic carbon content
(e.g., vegetation, moisture, drainage, relief, management)
are different in different directions, with the largest
variability in the N-S direction.

Semi-variograms of A horizon soil properties (clay and
organic carbon contents) also indicated presence of drift.
The WU variances were used as the criterion to determine
stationarity. WU variances were 10.0 and 0.0 for the A
horizon clay and organic carbon contents, respectively.

A  large  part  of  the  drift  was  removed  for  semi- 
variograms  of  the  A  horizon  soil  properties  by  using  the 
residuals,  but  it  was  not  completely  removed.     A  reason  for 
this  can  be  related  to  the  presence  of  different  pedons 
within  short  distances. 

Semi-variograms  of  the  A  horizon  clay  content 
indicated  presence  of  structure  (Figures  24  and  25).  The 
maximum  variance  (sill)  was  reached  within  distances 
varying  from  20  to  35  km  (Table  15).     The  maximum  variation 
occurred  in  the  NW-SE  direction.    Variation  of  the  A 
horizon  clay  content  was  smaller  than  the  variation  of  the 
weighted average clay content. This was due to the fact
that weighted average data included horizons contrasting in
clay content, such as A and B horizons.

The direction of maximum variation of the A horizon
clay content corresponded to the direction in which the
weighted average clay content had the largest nugget
variance. Therefore, it is probable that the large
variation in the A horizon clay content in the NW-SE
direction was one of the causes of the large unexplained
variation in the same direction for the weighted average
clay content.

Table 15. Important semi-variogram parameters of the
A horizon selected properties.

Semi-variogram             Range     Sill    Nugget       %*
                            (km)             variance

Clay content
  Direction-independent     34.8     75.8      24.4     32.2
  East-West                 45.0     96.9      25.8     26.6
  Northeast-Southwest       25.0     55.6      28.6     51.4
  Northwest-Southeast       35.1    118.0       7.5      6.4
  North-South               20.1     53.0      22.3     42.1

Organic carbon content
  Direction-independent    <10.0    1.048     1.048    100.0
  East-West                <10.0    1.045     1.045    100.0
  Northeast-Southwest      <10.0    1.014     1.014    100.0
  Northwest-Southeast      <10.0    1.089     1.089    100.0
  North-South              <10.0    0.955     0.955    100.0

* % of the sill represented by the nugget variance.

The semi-variogram of A horizon organic carbon indicated,
as did the semi-variogram of weighted average organic
carbon content, a pure nugget effect (Figures 26 and 27).
A reason for this is that the A horizon organic carbon
content had a large point-to-point variation at short
distances. Variation of the A horizon organic carbon
content was larger than that for the weighted average
organic carbon content. This could be due to the fact that
some of the A horizons were affected by management
conditions (Ap) and other A horizons were under relatively
natural conditions (A1). This result seems to be in
contradiction with results obtained with the nested
analysis of variance (Table 13), but two aspects need to be
considered. First, the nested analysis of variance did not
consider the pedon location. Second, the nested analysis
of variance included surface and subsurface A horizons.
Therefore, a masking of the differences between surface A
horizons (Ap and A1) could have occurred when the nested
analysis of variance was used.

All observed semi-variograms had a characteristic wave
pattern, indicating a cyclic variation in the studied soil
properties.

Fitting Semi-Variograms

The  process  of  fitting  the  observed  semi-variogram  to 
a  theoretical  model  is  another  important  step  in  the 
geostatistical  analysis.     It  is  important  to  choose  the 
appropriate  model  for  estimating  the  semi-variogram  because 
each  model  yields  quite  different  values  for  the  nugget 
variance  and  range,  both  of  which  are  critical  parameters 
for  kriging. 

The process of fitting observed semi-variograms to
theoretical models was time-consuming. Therefore, only the
direction-independent semi-variograms and the
direction-dependent semi-variograms with the largest
variation were selected. Olea
(1984)  stated  that  there  is  no  single  solution  to  curve 
fitting.     The  user  must  decide  what  part  of  the  semi- 
variogram  should  be  fitted  and  what  part  should  be  regarded 
as  anomalous. 

Points located within distances varying from 0 to 50 km
were selected because this interval included the range and
the reliability of the semi-variogram was high within
these distances. The choice of the model was governed by
the general graphic appearance of the observed
semi-variogram.

The curve fitting procedure described in the Materials
and Methods section was used. The objective of the fitting
procedure was to adjust the parameters in the
semi-variogram until the model was theoretically consistent.
Consistency is reached when the kriged average error (KAE)
is approximately zero and the kriged reduced mean square
error (KRMSE) is approximately equal to one. KAE and KRMSE
are defined by equations (47) and (48).

KAE gives the average of the difference between the
observed and the theoretical (estimated) values. KRMSE
represents the ratio between the theoretical and the
calculated variance (sill).

Models selected had KAE and KRMSE values very close to
zero and one, respectively. Kriged mean square errors
(KMSE) were computed according to the following equation:

KMSE = \left[ \frac{1}{n} \sum_{i=1}^{n} (Z_i - \hat{Z}_i)^2 \right]^{1/2}    (52)

where n = number of points
      Z_i = measured value
      \hat{Z}_i = kriged value

KMSE gives an idea of the dispersion of the measured values
with respect to the kriged values.
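
The three goodness-of-fit statistics can be computed from the measured values, the kriged values, and the kriging variances at the cross-validated points. The sketch below is hedged: KMSE follows equation (52), but equations (47) and (48) are not reproduced in this section, so the forms of KAE (mean error) and KRMSE (mean squared error scaled by the kriging variance) are assumed from common cross-validation practice. All data are made up.

```python
import numpy as np


def cross_validation_stats(measured, kriged, kriging_variance):
    """Return (KAE, KMSE, KRMSE) for a set of cross-validated points."""
    measured = np.asarray(measured, dtype=float)
    kriged = np.asarray(kriged, dtype=float)
    kriging_variance = np.asarray(kriging_variance, dtype=float)
    error = measured - kriged
    kae = error.mean()                               # assumed form; near 0 if unbiased
    kmse = np.sqrt((error ** 2).mean())              # equation (52)
    krmse = ((error ** 2) / kriging_variance).mean() # assumed form; near 1 if consistent
    return kae, kmse, krmse


# Made-up values for illustration only.
measured = [80.0, 70.0, 90.0, 60.0]
kriged = [78.0, 73.0, 88.0, 61.0]
variance = [4.0, 9.0, 4.0, 1.0]
kae, kmse, krmse = cross_validation_stats(measured, kriged, variance)
print(kae, kmse, krmse)
```

In this constructed example the squared errors equal the kriging variances, so KRMSE is one and KAE is zero, which is the consistent case described above.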

Based on these values (Table 16), direction-independent
semi-variograms for weighted average values of
total sand and clay contents were fitted by the DeWijsian
(logarithmic) model (Figures 18 and 20). Total sand and
clay content N-S semi-variograms were fitted by the
Spherical model (Figures 35 and 36). Organic carbon
content direction-independent and N-S semi-variograms were
fitted by the Linear model (Figures 22 and 37).

Table 16. Goodness-of-fit values of the weighted average
of selected soil properties.

Semi-variogram            Model          KAE*     KMSE+   KRMSE**

Total sand content
  Direction-independent   DeWijsian   -0.0589   16.8347    1.0670
  North-South             Spherical   -0.0699   16.7292    1.3826

Clay content
  Direction-independent   DeWijsian    0.0666   13.7246    1.0001
  North-South             Spherical    0.0363   13.6306    1.1479

Organic carbon content
  Direction-independent   Linear      -0.0001    0.3754    1.0535
  North-South             Linear       0.0004    0.3778    0.8321

*  KAE = Kriged Average Error
+  KMSE = Kriged Mean Square Error
** KRMSE = Kriged Reduced Mean Square Error

The KAE and KRMSE values for the A horizon soil
properties (Table 17) indicated that the
direction-independent and the NW-SE semi-variograms for clay
content were fitted by the Spherical and Root models,
respectively (Figures 24 and 38). The A horizon
direction-independent and NW-SE semi-variograms for organic
carbon content were both fitted by the Linear model
(Figures 26 and 39).
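
For reference, three of the model types named in this section can be written as short functions. These are textbook forms given as a sketch, and the parameterization used in the Materials and Methods section may differ. In particular, the coefficient of the logarithmic model is written here as a single `log_slope` (some texts write it as 3 times an absolute dispersion coefficient). The Root model is omitted because its form is not given here. The example parameters are the observed N-S total sand values from Table 14, used only for illustration and not as the fitted parameters.

```python
import numpy as np


def spherical(h, nugget, sill, a):
    """Spherical model for h > 0: rises from the nugget to the sill at range a."""
    h = np.asarray(h, dtype=float)
    c = sill - nugget
    inside = nugget + c * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return np.where(h < a, inside, sill)


def linear(h, nugget, slope):
    """Linear model: no sill; a zero slope gives a pure nugget effect."""
    return nugget + slope * np.asarray(h, dtype=float)


def de_wijsian(h, nugget, log_slope):
    """Logarithmic (De Wijsian) model, defined for h > 0; it has no sill."""
    return nugget + log_slope * np.log(np.asarray(h, dtype=float))


# Illustrative parameters: nugget 120.5, sill 352.5, range 30.0 km.
lags = np.array([5.0, 15.0, 30.0, 40.0])
print(spherical(lags, nugget=120.5, sill=352.5, a=30.0))
```

The spherical model is the only one of the three that levels off at a sill, which is why it suits semi-variograms with a clear range.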

Kriging

One  of  the  prime  reasons  for  obtaining  a  semi- 
variogram  is  to  use  it  for  estimation.     Soil  survey 
recognizes  two  main  kinds  of  estimates  (Webster,  1985). 
One  is  the  average  value  of  a  soil  property  within  some 
defined  region.     The  other  is  the  prediction  of  values  of  a 
property  at  unsampled  places  (interpolation). 

The  information  derived  from  the  fitted  semi- 
variograms  was  used  to  generate  contour  maps  of  kriged 
values  of  soil  properties  (interpolated  values). 
Contour  maps  were  produced  by  using  universal  kriging 
because  trends  were  present  in  the  data. 

Kriging is a technique of making optimal, unbiased
estimates of regionalized variables at unsampled locations
using the information contained in the semi-variogram
(range, nugget effect, theoretical model). Kriging is
optimal because it minimizes the estimation variance and is
unbiased because the expected estimation error (KAE) is
zero.
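
The way the semi-variogram enters the estimate can be shown with a small kriging system. The sketch below implements ordinary kriging, which is simpler than the universal kriging used in this study because it has no drift terms. It is not the geostatistical program used here. The pedon coordinates and values are made up, and the spherical parameters are the observed N-S total sand values from Table 14, borrowed for illustration.

```python
import numpy as np


def spherical_gamma(h, nugget, sill, a):
    """Spherical semi-variogram with gamma(0) = 0 by definition."""
    h = np.asarray(h, dtype=float)
    c = sill - nugget
    g = np.where(h < a, nugget + c * (1.5 * h / a - 0.5 * (h / a) ** 3), sill)
    return np.where(h > 0.0, g, 0.0)


def ordinary_kriging(coords, values, target, gamma):
    """Return (estimate, estimation variance, weights) at one target point."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Semi-variances between samples, bordered by the unbiasedness
    # constraint that forces the weights to sum to one.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = gamma(d)
    lhs[n, n] = 0.0
    # Semi-variances between each sample and the target point.
    d0 = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    rhs = np.ones(n + 1)
    rhs[:n] = gamma(d0)
    solution = np.linalg.solve(lhs, rhs)
    weights, lagrange = solution[:n], solution[n]
    estimate = weights @ values
    variance = weights @ rhs[:n] + lagrange
    return estimate, variance, weights


def gamma(h):
    return spherical_gamma(h, nugget=120.5, sill=352.5, a=30.0)


# Four hypothetical pedons on a 10 km square, made-up sand contents (%).
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
sand = [80.0, 70.0, 90.0, 60.0]
est, var, w = ordinary_kriging(coords, sand, (5.0, 5.0), gamma)
print(est, var, w)
```

The square root of the returned variance is the standard error of the kind used for reliability maps. At a sampled location the system returns the observed value with zero variance.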

Table 17. Goodness-of-fit values for the A horizon
selected properties.

Semi-variogram            Model          KAE*     KMSE+   KRMSE**

Clay content
  Direction-independent   Spherical    0.1071    6.3258    1.0175
  Northwest-Southeast     Root         0.2497    8.1069    1.4189

Organic carbon content
  Direction-independent   Linear       0.0000    1.0212    0.9883
  Northwest-Southeast     Linear      -0.0003    1.0336    1.0017

*  KAE = Kriged Average Error
+  KMSE = Kriged Mean Square Error
** KRMSE = Kriged Reduced Mean Square Error

Contour maps were generated for total sand and clay
(weighted average and A horizon) contents. They were derived
from the direction-independent semi-variograms and from the
direction-dependent semi-variograms with the largest
variability (Figures 28, 29, and 30).
Contour maps were better interpreted when used with
diagrams. Organic carbon content (weighted average and A
horizon values) was not contoured because its large nugget
variance produced large estimation variances, which reduce
the reliability of the map.

Areas with discontinuous contour lines (i.e., the lower
left-hand side and upper right-hand side of the figures
used) corresponded to zones with no sampled pedons (Figure
15). Contour maps were influenced by the nugget and the
minimum variance criterion (within-unit variance). The
latter ensured that the interpolated value at a sampling
point was the observed value there. The presence of a
nugget variance indicated that the semi-variogram was
composed of two functions (except for organic carbon), one
describing the spatial dependence, and the other a purely
random variation that influenced the boundary between
delineations.

Clay content weighted average had a nugget variance
larger in proportion to the sill than that for total sand
content (Table 14). This situation could indicate that the
map of kriged values of clay content weighted average
encompassed more variable units than the map of kriged
values of total sand content. In general, there was a very
good correspondence between the two maps, because both
soil properties were important components of the same PC
and, in addition, the silt content was very low. Another
aspect is that PCA not only allowed the selection of
important soil properties for separating different soil
series, but also the soil properties selected were
spatially related.

An  important  objective  was  to  find  a  physical  meaning 
of  the  contour  maps  and  diagrams  generated  by  the 
geostatistical  analysis.     For  this  reason  contour  maps  and 
diagrams  were  compared  with  the  map  of  physiographic 
regions  of  Florida  (Figure  43).     All  contour  maps  have  X 
and  Y  axes  represented  by  the  values  of  the  geographic 
coordinates,  which  were  very  useful  in  locating  the 
physiographic  regions. 

Pedons  studied  were  located  in  five  physiographic 
regions:  Southern  Pine  Hills,  located  between  the  0  and  29 
X  coordinates  and  the  8  and  22  Y  coordinates;  Dougherty 
Karst,  located  between  the  29  and  49  X  coordinates  and  the 
8  and  22  Y  coordinates;  Apalachicola  Delta,  located  between 
the  24  and  58  X  coordinates  and  the  0  and  12  Y  coordinates; 
Tifton  Uplands,  located  between  the  48  and  58  X  coordinates 
and  the  11  and  15  Y  coordinates;  and  Ocala  Uplift,  located 
between the 58 and 80 X coordinates and the 0 and 15 Y
coordinates.

In general, maximum total sand content corresponded to
minimum clay content (Figures 28 and 29). These soil
properties were naturally related because of the presence
of argillic horizons. Total sand content had a large
depression between the 47 and 54 X coordinates, which
corresponded to an area between the Apalachicola Delta, the
Dougherty Karst, and the Tifton Uplands physiographic
regions. The total sand content diagram indicated a
predominance of high values across the Panhandle, but there
was a break in the continuity of the high values because of
the presence of the Apalachicola Delta.

Diagrams of total sand and clay contents (Figures 28
and 29) also indicated a large number of small depressions
between the 22 and 45 X coordinates and the 58 and 70 X
coordinates. These areas corresponded to the Dougherty
Karst and the Ocala Uplift, respectively. The presence of
depressions seems to indicate a large variability in total
sand and clay contents. The A horizon clay content
(Figure 30) also had large variability in the Dougherty
Karst physiographic region. The variability in soil
properties in the Dougherty Karst is apparently related to
the large variability in the karst topography. The
variability in the Ocala Uplift can be the result of local
differences in geology and topography, including karst.
Diagrams of clay content (Figures 29 and 30) indicated that
clay content tended to decrease from north to south.

Contour maps derived from direction-dependent
semi-variograms with maximum variation (Figures 40, 41, and
42) did not differ from those maps derived using
direction-independent semi-variograms. This fact indicated
that the geostatistical program was not capable of
generating contour maps derived from direction-dependent
semi-variograms. A reason for this is that the kriging
subroutine of the geostatistical program did not require
the direction of the semi-variogram as input data.
Therefore, improvement of the geostatistical program to
take into account the direction-dependent semi-variograms
is recommended. Despite the fact that the program was not
capable of generating contour maps derived from
direction-dependent semi-variograms, one question that needs
to be answered is: Are there significant differences among
the direction-dependent variances? If there are no
significant differences among direction-dependent variances,
there should be no differences between contour maps derived
from direction-independent and direction-dependent
semi-variograms for individual soil properties.

An important advantage of kriging was that this
interpolation technique provided estimates of the
estimation variance for each estimated value. These
estimates can be displayed in the form of standard error
(reliability) maps or diagrams. Reliability estimates
indicate the precision of the kriged values and,
alternatively, could indicate where more samples would
provide more information.

Reliability diagrams (Figures 31, 32, and 33) were
produced based on the standard errors of the kriged values.
The standard errors varied from 14.95 to 21.11 for total
sand and from 13.10 to 16.33 for clay content weighted
averages, and from 4.54 to 6.05 for A horizon clay content.
The standard error is a function of the nugget variance.
The larger the nugget variance compared to the sill value,
the larger the standard error compared to the kriged value.
Clay content weighted average had the largest standard
error and A horizon clay content had the smallest.

Reliability  diagrams  indicated  areas  of  large  and 
small  standard  error.     Generally,  areas  with  the  smallest 
standard  errors  corresponded  to  areas  with  no  sampled 
pedons  (Figure  15).     All  three  reliability  diagrams 
coincided  in  indicating  the  eastern  part  of  the  Dougherty 
Karst  physiographic  region  as  the  area  with  largest 
standard  error.     These  reliability  diagrams  can  be  very 
useful  in  the  design  of  new  sampling  strategies.  Areas 
with  large  standard  error  require  an  increase  in  sampling 
intensity  to  increase  the  precision  of  the  estimates. 

Geostatistical techniques were useful in evaluating
the spatial variability of soils and in indicating zones
where more intensive sampling is required. Geostatistical
techniques require more investigation in order to better
define the assumptions of stationarity and anisotropy.
Anisotropy indicated that the variability of soil
properties was direction-dependent. An important
question remains to be answered: is there a significant
difference among the direction-dependent variances?

Fractals

Soil variation can also be a function of the scale of
observation. The components of the variance measure the
amount of variance contributed by each scale, and by
accumulating them, it may be possible to show how variance
increases with increasing distance.

The ranges of the semi-variograms studied varied from
15 to 35 km. The random variation corresponded almost
always  to  a  large  proportion  of  the  total  variance  within 
these  range  distances  (Tables  14  and  15).     Therefore,  the 
objective  of  this  section  was  to  evaluate  quantitatively 
how  semi-variograms  can  be  used  to  indicate  the  scale 
dependence  of  soil  variability  in  the  study  area. 

All  semi-variograms  had  a  nugget  component  which 
represented  the  random  variability.     Organic  carbon  content 
(weighted  average  and  A  horizon  values)  had  a  pure  nugget 
effect  indicating  a  short-range  variation.     Other  soil 
properties  studied  had  long-range  variation.  Soil 
properties  with  long-range  variation  had  nugget  variances 
that  varied  from  approximately  6%  to  about  73%  of  the  total 
variance (Tables 14 and 15).

The Hausdorff-Besicovitch dimension or fractal
dimension (D) was calculated according to equation (43) to
determine if the scale of study was appropriate to resolve
the random component in the variability of the soil
properties studied. Theoretically, the D value ranges from
1 to 2. A value of 1 indicates a systematic variation of
soil properties and also indicates an appropriate scale of
study. A value of 2 indicates a random variation and the
need to increase the scale of the study. If soil
variability is scale-dependent, a decrease in the D value
is expected when the distance between sampling points is
decreased. For this reason, fractal dimensions were
computed from the semi-variograms studied using lag
distances of 10 and 5 km (Table 18).
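
A D value can be estimated from the slope of the semi-variogram on a double logarithmic plot. The sketch below assumes the relation D = (4 - m) / 2, where m is the slope of log G(h) against log h, as commonly used for soil transects. Equation (43) is not reproduced in this section, so this form is an assumption. The function name and the lag values are illustrative.

```python
import numpy as np


def fractal_dimension(lags, semivariances):
    """Estimate D from the log-log slope m of the semi-variogram: D = (4 - m) / 2."""
    slope, _intercept = np.polyfit(np.log(lags), np.log(semivariances), 1)
    return (4.0 - slope) / 2.0


lags = np.array([5.0, 10.0, 15.0, 20.0])

# Pure nugget effect: constant semi-variance, slope 0, D = 2 (random variation).
d_nugget = fractal_dimension(lags, np.full(4, 0.120))

# Semi-variance growing as h**2: slope 2, D = 1 (systematic variation).
d_smooth = fractal_dimension(lags, lags ** 2)

print(d_nugget, d_smooth)
```

Under this relation, a flat semi-variogram gives D = 2, which matches the value reported for organic carbon, whose semi-variograms showed a pure nugget effect.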

Organic  carbon  content  (weighted  average  and  A  horizon 
values)  had  a  D  value  of  2  in  all  directions  and  for  both 
lag  distances.     This  result  was  supported  by  the  semi- 
variogram  analysis  which  indicated  a  pure  nugget  effect. 
In addition, organic carbon content also had the largest
C.V. (Table 2). These results indicated the short-range
variation in organic carbon content. The variation in
organic carbon content was random at the scale used; thus,
an increase in the scale of study is recommended to
explain the variability of this soil property.

Weighted average total sand content had D values that
varied from 1.80 to 1.92 (Table 18) for the 10 km lag
distance. The high D values indicated that it is necessary
to increase the scale of the study to explain better the
variability and to reduce the random component in the
variability of the total sand content. The D values of
total sand content calculated for a lag distance of 5 km
were always smaller than or equal to the D values
calculated for the 10 km lag distance. Therefore, it can
be concluded that the variability in total sand content is
scale-dependent. The decrease in the D value was a
function of the proportion of the nugget variance (random
variability) present. There was no decrease for the
NW-SE direction, which had the largest proportion of nugget
variance (Table 14). This fact may indicate a complex
variability in the total sand content in the NW-SE
direction because of a complex variability in geology and
topography in this direction. It is necessary to increase
the scale of the study not only to reduce the random
variability but also to find a physical meaning for the
variability.

Table 18. Fractal dimension (D value) derived from
selected soil property semi-variograms.

                         Lag        D value by soil property*
Semi-variogram         distance
                         (km)     TS     CL     OC    A-CL   A-OC

Direction-independent     10     1.87   1.90   2.00   1.77   2.00
                           5     1.84   1.72   2.00   1.74   2.00

E-W                       10     1.87   1.90   2.00   1.76   2.00
                           5     1.86   1.77   2.00   1.85   2.00

NE-SW                     10     1.90   1.93   2.00   1.86   2.00
                           5     1.88   1.73   2.00   1.80   2.00

NW-SE                     10     1.92   [illegible]  2.00   1.63   2.00
                           5     1.92   1.80   2.00   1.66   2.00

N-S                       10     1.80   1.82   2.00   1.84   2.00
                           5     1.66   1.51   2.00   1.77   2.00

* See abbreviations, pp. xii-xiii

Fractal  analysis  of  weighted  average  clay  content 
(Table  18)  gave  similar  results  as  the  analysis  of  weighted 
average  total  sand  content. 

The D values calculated for the A horizon clay content
were almost always smaller than the D values calculated for
the other soil properties studied. This can be related to the
fact that the A horizon clay content had smaller
proportions of nugget variance (Table 15) than the
other soil properties studied (Tables 14 and 15).

The results of the fractal analysis were consistent
with the results obtained with total sand and clay
(weighted average and A horizon) content semi-variograms.
Semi-variograms indicated the presence of a systematic and
a random variability within the range. The D values
suggested that the random component of the variance was
large because of the small scale of the study. The long
distance between observations could contribute to the presence
of a large random variation. When the distance between pedons
is long, local variations in parent materials and/or
topography may increase the complexity of the variability
of the soil properties studied. Therefore, an increase in
scale and a smaller distance between sampling locations
(pedons) are necessary to reduce the random variability.

Despite the fact that D values for the 5 km lag distance
were smaller than those for the 10 km lag distance, the semi-
variograms for the 5 km lag distance were only reliable for
short distances (1/3 to 1/2 of the total length).
Therefore, a smaller area with greater pedon density,
shorter distances between pedons, and a more uniform
physiography was selected (Figure 34). The reduced area
was located in the Ocala Uplift physiographic region.

In general, the D values were reduced (Table 19) for
weighted average total sand and clay contents compared
with the D values for the entire study area (Table 18).
Some of the D values for organic carbon content decreased.
The D values for the A horizon clay content remained
approximately the same or increased. These results
indicate that when the scale of study is increased, soil
properties with a large proportion of nugget variance
(e.g., total sand) show a larger decrease in the random
variation than soil properties with a small proportion of
nugget variance (e.g., the A horizon clay content).

Table 19. Fractal dimension (D value) derived from
          selected soil property semi-variograms for
          a reduced study area.

                    Lag                Soil property *
Semi-variogram    distance    TS     CL     OC     A-CL   A-OC
                    (km)      ------------ D value ------------

Direction-           10       1.80   1.60   2.00   1.74   2.00
independent           5       1.76   1.68   1.97   1.72   2.00

E-W                  10       1.87   1.86   2.00   1.75   2.00
                      5       1.95   1.66   2.00   1.72   1.96

NE-SW                10       1.66   1.70   1.90   1.86   2.00
                      5       1.22   1.33   1.90   1.86   2.00

NW-SE                10       1.96   1.91   2.00   1.62   2.00
                      5       2.00   1.98   2.00   1.78   2.00

N-S                  10       1.76   1.85   1.59   1.74   2.00
                      5       1.68   1.79   1.59   1.40   2.00

* See abbreviations, pp. xii-xiii

Some D values for weighted average total sand and clay
contents and A horizon clay content increased for the 5 km lag
distance, indicating a complex variation within this
distance. Thus, if random variation is to be reduced, the
distance between sampling locations has to be smaller than
5 km.

Large D values (larger than 1.5) have been found
to be common in soils (Burrough, 1983b). For future
planning of small scale studies in the area, it is
necessary to consider that even a 5 km distance between
sampling locations gives large D values, which indicate a
large proportion of random variability.

The geostatistical analysis allowed the separation of
the systematic and the random components of the variance.
The fractal analysis indicated that it is necessary to
increase the scale of study if the random component is to
be reduced. Then, the variability of the soil properties
studied could be explained better as a result of their
specific location with respect to their parent material,
topography, vegetation, and climate.


SUMMARY  AND  CONCLUSIONS 
One hundred fifty-one pedons were selected to
determine the important soil properties affecting the
spatial variability of soils in northwest Florida. Each
pedon was located by a system of geographic coordinates (X
and Y).

Twenty soil properties (horizon thickness; very
coarse, coarse, medium, fine, and very fine sand fractions;
total sand, silt, and clay contents; pH-water; pH-KCl;
organic carbon content; Ca, Mg, Na, and K contents
extractable in NH4OAc; total bases; extractable acidity;
cation exchange capacity; and base saturation) were
initially selected for this study.

Principal  component  analysis  (PCA)  and  geostatistics 
were  used  in  addition  to  other  statistical  analyses. 

All properties were non-normally distributed, based on
the Kolmogorov test. This result could be influenced by
systematic patterns of soil properties, because observations
are not independent. For example, the clay content in the
argillic horizon is not independent of the clay content in
the eluvial horizon. Argillic horizons develop because
clay is translocated from the upper horizons and is
deposited in the lower horizons, and they
develop under specific conditions. The process is not
random. Values of soil properties were not independent but
associated, and this fact could influence the probability
distribution.
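
The Kolmogorov test compares the empirical cumulative distribution of a sample with a theoretical one. A minimal sketch of the test statistic against a normal distribution follows. It assumes that the mean and standard deviation are estimated from the sample, which corresponds to the Lilliefors variant and requires its own critical values; the source does not state which variant or software was used. The function names and the two samples are hypothetical.

```python
import math
import statistics


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ks_statistic(sample):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    mean = statistics.fmean(xs)
    sd = statistics.stdev(xs)
    largest_gap = 0.0
    for i, x in enumerate(xs):
        fitted = normal_cdf(x, mean, sd)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        largest_gap = max(largest_gap, (i + 1) / n - fitted, fitted - i / n)
    return largest_gap


# Hypothetical samples: one roughly symmetric, one strongly right-skewed.
symmetric = [float(v) for v in range(1, 21)]
skewed = [float(2 ** k) for k in range(20)]

d_symmetric = ks_statistic(symmetric)
d_skewed = ks_statistic(skewed)
```

A larger statistic indicates a stronger departure from normality; in this sketch the skewed sample gives the larger value.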

Individual soil properties have different degrees of
importance in influencing the spatial variability of soils.
In addition, geostatistical analysis is time-consuming and
complex. Therefore, PCA was used as an unbiased method to
reduce the number of soil properties initially selected for
study with geostatistics.

Two sets of data were used. One set was composed of
weighted averages of soil properties of individual pedons.
Horizon thickness was used as the weighting criterion.
Information is lost when averages are used; therefore, a
second set of data composed of soil properties from the
surface A horizon was used. All soil properties were
standardized to mean zero and variance one.
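
The two data preparation steps described above can be sketched as follows. This is a minimal illustration, assuming that the weighted average of a pedon is the sum of horizon values multiplied by horizon thickness, divided by total thickness, and that standardization uses the sample standard deviation. The function names and all numbers are hypothetical.

```python
import statistics


def thickness_weighted_mean(values, thicknesses_cm):
    """Average horizon values of one pedon, weighted by horizon thickness."""
    total_thickness = sum(thicknesses_cm)
    weighted_sum = sum(v * t for v, t in zip(values, thicknesses_cm))
    return weighted_sum / total_thickness


def standardize(values):
    """Rescale a variable to mean zero and variance one."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return [(v - mean) / sd for v in values]


# Hypothetical pedon with three horizons: clay content (%) and thickness (cm).
clay_by_horizon = [5.0, 10.0, 30.0]
thickness_by_horizon = [20.0, 30.0, 50.0]
pedon_clay = thickness_weighted_mean(clay_by_horizon, thickness_by_horizon)

# Hypothetical weighted averages for four pedons, standardized across pedons.
z_scores = standardize([19.0, 12.0, 26.0, 15.0])
```

Standardization removes the effect of measurement units, so that properties such as pH and clay content contribute on a comparable basis to the PCA.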

Each principal component (PC) selected explained at
least 5% of the total variance of each set of data.
Selection of soil properties was based on plots of soil
properties in the plane of the first two PCs, orthogonal
rotation of the PC axes, quantitative selection of large
eigenvector elements, analysis of collinearity, and correlation
coefficients between soil properties and PCs.

Weighted average total sand, clay, and organic carbon
contents and A horizon clay and organic carbon contents
were selected by the PCA for the geostatistical analysis.
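
The 5% retention rule and the correlation between a soil property and a PC can be sketched as follows. This is a minimal illustration, assuming a PCA of standardized variables, for which the eigenvalues sum to the number of variables and the correlation between a variable and a PC equals the eigenvector element multiplied by the square root of the eigenvalue. The eigenvalues and the eigenvector element are hypothetical and are not results of this study.

```python
import math


def select_components(eigenvalues, min_share=0.05):
    """Return the 1-based indices of PCs explaining at least min_share of variance."""
    total = sum(eigenvalues)
    return [k + 1 for k, ev in enumerate(eigenvalues) if ev / total >= min_share]


def variable_pc_correlation(eigenvector_element, eigenvalue):
    """Correlation between a standardized variable and a principal component."""
    return eigenvector_element * math.sqrt(eigenvalue)


# Hypothetical eigenvalues for six standardized soil properties (sum = 6).
eigenvalues = [2.9, 1.6, 0.8, 0.4, 0.2, 0.1]
kept = select_components(eigenvalues)

# Hypothetical eigenvector element of one property on the first PC.
correlation = variable_pc_correlation(0.5, eigenvalues[0])
```

In this sketch the first four PCs are kept, because the fifth and sixth each explain less than 5% of the total variance.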

Results  of  the  PCA  were  supported  by  a  nested  analysis 
of  variance.     Soil  properties  selected  by  the  PCA  were 
important  in  explaining  the  variability  within  and  between 
soil  series  and  between  horizons  as  shown  by  the  nested 
analysis  of  variance. 

PCA and the nested analysis of variance proved to be
useful statistical techniques to select important soil
properties to study soil variability. The nested analysis
of variance not only validated the results of the PCA but
also indicated that the selected soil properties were
differentiating properties. Therefore, both analyses can
be used together for a quantitative determination of
differentiating soil properties. PCA can also be useful to
determine the correct placement of pedons into the soil
classification system.

Selected soil properties were employed to study soil
variability using geostatistical analysis. The
geostatistical analysis had four parts: semi-variogram
calculation, fitting of semi-variograms, kriging, and use
of fractals.

Direction-independent and -dependent (E-W, NE-SW,
NW-SE, and N-S) semi-variograms were calculated for each
selected soil property on a 380 x 100 km irregular grid.

Within-series variance was used as the criterion to assess
stationarity of soil property values. The first calculated
semi-variograms (direction-dependent and -independent)
indicated the presence of drift in the soil property values.
Drift was reduced by using residuals, but was not
completely removed. A reason for this may be the presence of
a short-range or a cyclic variation in soil properties.
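
The calculation of a direction-independent semi-variogram can be sketched as follows. This is a minimal illustration, assuming the usual estimator in which the semivariance for a lag class is the sum of squared differences between paired values divided by twice the number of pairs. It is not the geostatistical program used in this study; the function name, the lag classes, and the data are hypothetical.

```python
import math


def semivariogram(points_km, values, lag_km, n_lags):
    """Empirical semi-variogram for lag classes [k * lag_km, (k + 1) * lag_km).

    Returns a list of (semivariance or None, number of pairs), one per class.
    """
    squared_sums = [0.0] * n_lags
    pair_counts = [0] * n_lags
    for i in range(len(points_km)):
        for j in range(i + 1, len(points_km)):
            dx = points_km[i][0] - points_km[j][0]
            dy = points_km[i][1] - points_km[j][1]
            lag_class = int(math.hypot(dx, dy) // lag_km)
            if lag_class < n_lags:
                squared_sums[lag_class] += (values[i] - values[j]) ** 2
                pair_counts[lag_class] += 1
    return [
        (s / (2 * c) if c else None, c)
        for s, c in zip(squared_sums, pair_counts)
    ]


# Hypothetical transect: four locations 5 km apart with a rising property value.
points = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0), (15.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
result = semivariogram(points, values, lag_km=5.0, n_lags=3)
```

Here the semivariance increases with distance, which is the pattern expected when drift or spatial structure is present. Lag classes with few pairs are unreliable and are usually disregarded.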

Weighted  average  total  sand  and  clay  contents  and  A 
horizon  clay  content  were  characterized  by  the  presence  of 
structure.     A  nugget  variance  was  also  present.     The  semi- 
variogram  range  varied  from  15  to  35  km. 

Variability  of  soil  properties  was  direction- 
dependent.     Weighted  average  values  had  the  largest 
variability  in  the  N-S  direction.     The  A  horizon  clay 
content  had  the  largest  variability  in  the  NW-SE  direction. 
Differences  between  direction-dependent  semi-variograms 
could  be  the  result  of  differences  in  geology  and 
topography. 
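
The assignment of a pair of pedons to one of the four directions can be sketched as follows. This is a minimal illustration, assuming that X increases eastward, Y increases northward, and each direction class has a tolerance of 22.5 degrees on either side. The tolerance actually used by the geostatistical program is not stated in the source, and the function name is hypothetical.

```python
import math

DIRECTIONS = ["E-W", "NE-SW", "N-S", "NW-SE"]


def direction_class(p, q):
    """Classify the line joining p and q into one of four direction classes."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    # Angle from east, counterclockwise, folded into [0, 180) degrees because
    # a pair has no orientation (E-W is the same as W-E).
    angle = math.degrees(math.atan2(dy, dx)) % 180.0
    # Shift by half a class width so that each class is centered on its axis.
    index = int(((angle + 22.5) % 180.0) // 45.0)
    return DIRECTIONS[index]


east_pair = direction_class((0.0, 0.0), (1.0, 0.0))
northeast_pair = direction_class((0.0, 0.0), (1.0, 1.0))
north_pair = direction_class((0.0, 0.0), (0.0, 1.0))
southeast_pair = direction_class((0.0, 0.0), (1.0, -1.0))
```

A direction-dependent semi-variogram is then obtained by restricting the pairs in each lag class to those that fall in one direction class.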

Weighted average and A horizon organic carbon contents
had pure nugget effects, indicating that organic carbon
contents had a large point-to-point variation at short
distances.

All observed semi-variograms had a wave pattern that
indicated the presence of cyclic variations in the studied
soil properties.

Observed semi-variograms (direction-independent and
direction-dependent with largest sill) were fitted to
theoretical models. Spherical, De Wijsian, Linear, and Root
models were selected.
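
Two of the theoretical models named above can be sketched as follows. This is a minimal illustration, assuming the standard forms of the spherical and linear models with a nugget variance; here the sill is the total sill (nugget plus structured variance). The parameter values are hypothetical and are not the fitted values of this study.

```python
def spherical(h, nugget, sill, range_km):
    """Spherical model: rises from the nugget and levels off at the range."""
    if h <= 0:
        return 0.0
    if h >= range_km:
        return sill
    ratio = h / range_km
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)


def linear(h, nugget, slope):
    """Linear model: no sill, semivariance grows steadily with distance."""
    if h <= 0:
        return 0.0
    return nugget + slope * h


# Hypothetical parameters: nugget 2, total sill 10, range 35 km.
at_half_range = spherical(17.5, nugget=2.0, sill=10.0, range_km=35.0)
at_range = spherical(35.0, nugget=2.0, sill=10.0, range_km=35.0)
nugget_proportion = 2.0 / 10.0
```

The nugget proportion (nugget divided by total sill) expresses the share of the variance that appears random at the sampling scale.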

The  information  derived  from  the  fitted  semi-variogram 
was  used  to  produce  contour  maps  and  diagrams  of  kriged 
soil  properties.     Contour  maps  were  generated  using 
universal  kriging  because  of  the  presence  of  drift  in  the 
data. 

Contour  maps  for  the  direction  with  largest  variation 
did  not  differ  from  those  derived  from  direction- 
independent  semi-variograms.     A  reason  for  this  is  that  the 
geostatistical  program  does  not  take  into  account  the 
direction  of  semi-variograms. 

Contour maps of weighted average clay and sand
contents were similar. This similarity was due to the low
silt content and to the fact that these two variables were
members of the same principal component. Therefore, the use of
principal components as variables in the geostatistical
analysis can generate individual contour maps that would
represent all soil properties included in the principal
component. Further investigation of the use of principal
components as individual variables for the geostatistical
analysis is recommended. It is also recommended that such
results be compared with those obtained using
co-kriging. The use of principal components as individual
variables may have the advantage that a single map can
represent the variability of a group of soil properties
which have two characteristics: first, they are
important in explaining the total soil variability;
second, they are differentiating properties.

Diagrams  of  kriged  standard  errors  were  also  produced. 
They  indicated  that  soil  properties  with  large  nugget 
variance  had  large  standard  errors.     Likewise,  kriged 
standard  error  diagrams  can  be  very  useful  in  the  design  of 
new  sampling  strategies.     These  diagrams  identified  areas 
that  require  an  increase  in  sampling  intensity  to  improve 
the  precision  of  the  estimates.     The  diagrams  indicated  an 
area  located  in  the  northeastern  part  of  the  Dougherty 
karst  as  the  one  with  the  largest  standard  errors.  A 
possible  reason  for  this  may  be  the  irregular  topography  of 
the  limestone  in  the  area. 
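
The link between the semi-variogram and the kriged standard error can be sketched as follows. This is a minimal illustration using ordinary kriging, which is simpler than the universal kriging used in this study because it assumes no drift. The function names, the semi-variogram parameters, and the data are hypothetical.

```python
import math


def spherical(h, nugget=0.0, sill=1.0, range_km=30.0):
    """Spherical semi-variogram model with hypothetical default parameters."""
    if h <= 0:
        return 0.0
    if h >= range_km:
        return sill
    ratio = h / range_km
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def solve(matrix, rhs):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def ordinary_kriging(points, values, target, model=spherical):
    """Return the kriged estimate and its standard error at the target."""
    n = len(points)
    # Semivariances between samples, bordered by the unbiasedness constraint.
    a = [[model(distance(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [model(distance(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights, lagrange = solution[:n], solution[n]
    estimate = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + lagrange
    return estimate, math.sqrt(max(variance, 0.0))


# Hypothetical pedon locations (km) and clay contents (%).
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
values = [10.0, 20.0, 30.0]
est_sample, se_sample = ordinary_kriging(points, values, (0.0, 0.0))
est_between, se_between = ordinary_kriging(points, values, (5.0, 5.0))
```

At a sampled location the estimate reproduces the observed value with a standard error of zero, and the standard error grows with distance from the samples. This is why standard error diagrams point to areas that need more sampling.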

The kriged standard error diagram and the plot of pedon
locations can be very useful for planning future sampling
strategies. Standard error diagrams indicate areas that
require an increase in sampling. A plot of pedon locations
indicates specific places within the areas with large
standard error where additional samples need to be taken.

Results of the geostatistical analysis supported the
fact that values of soil properties are not independent.
Total sand and clay content semi-variograms had ranges that
varied from 10 to 35 km. Values of studied soil properties
were related (i.e., they were not independent) within the
range distances. Organic carbon content semi-variograms
did not have any range, but had a characteristic wave
pattern that also indicated a degree of dependence among
the values of organic carbon content.

Finally, the fractal dimension was derived from the
semi-variograms. In general, the fractal dimension was
large (larger than 1.5). These large values indicated that
the scale of the study needs to be increased. A fractal
dimension was also calculated for a reduced area. The
reduced area had a larger pedon density, and therefore
shorter distances between pedons, than the area
initially studied. The fractal dimension of soil
properties was reduced, indicating the scale-dependent
character of soil variability.

Results  of  the  fractal  analysis  were  as  expected 
because  the  studied  area  is  a  large  area  with  a  large 
variation  in  geology  and  topography.     Ranges  obtained  from 
semi-variograms  can  be  used  to  determine  the  grid  size 
necessary  to  study  the  spatial  variability  of  new  areas  in 
northwest  Florida. 

The quantification of soil variability has two
aspects: first, the conditions under which statistical
analyses are used; second, the statistical analyses that
are employed.

Soils in their natural environment do not follow
randomized block or Latin square patterns, but soil
scientists have used statistical experimental designs to
study specific soil properties in laboratory or
greenhouse experiments. The greenhouse experiments have
been performed to study how management (fertilizer, liming,
or irrigation) influences the soil properties of interest.
Results of greenhouse experiments have been validated by
specific experimental designs under field conditions.

For soil scientists involved in the study of
pedogenesis, soil variability, or soil geography, it is very
difficult to apply the same statistical experimental
designs or to apply similar statistical analyses because of
the difficulty of fulfilling the required assumptions.
Soil properties that are non-normally distributed cannot be
forced to normality. Dependent values of soil properties
cannot be forced to be independent. A biased sampling of
typical pedons cannot be forced to be a random sampling
procedure. But controlled conditions are required to
guarantee some standard conditions for quantitative
analysis and for further application in field conditions.

The opinion of this author is that soils have two
scales of variability. One, the large scale (vertical
direction), is limited by the root system of crops, grasses,
or trees, or by specific engineering uses (e.g., septic tanks).
The other, the small scale (horizontal direction), is
limited by the extent of the study or the land use.
Soil scientists have been relatively successful in studying
the vertical variability because they have been capable of
recognizing soil horizons. Soil properties are then
related to specific horizons. But soil scientists have
been less successful in studying the variability in the
horizontal direction because of the lack of emphasis on the
geographic aspect of soils.

Polypedons  have  a  geographic  connotation.  More 
emphasis  has  been  placed  on  their  morphologic  description 
than  on  their  geographic  relations.     Therefore,  the  use  of 
polypedons  as  geographic  entities  is  limited. 

External  environmental  features  (e.g.,  landform, 
vegetation)  which  are  consistently  recognized  and  mappable 
have  been  helpful  in  delineating  soils.     Therefore,  this 
author  believes  that  landscape  position  can  be  used  as  a 
geographic  entity  to  study  soil  variability.  Landscape 
position  takes  into  account  geographic  aspects.  The  close 
relationship  between  soil  and  landscape  position  has  been 
discussed  by  many  pedologists.     Landscape  position  can 
represent  the  "greenhouse"  to  test  quantitative  methods  to 
study  variability  of  soil  in  natural  conditions. 

The other important aspect of the quantitative
analysis of soil variability is related to the statistical
analyses themselves. Normality, independence, and
homogeneous variance are basic assumptions for classical
statistical analyses.

This  study  showed  that  the  set  of  data,  selected  for 
this  study,  sampled  in  northwest  Florida  since  1967,  was 
non-normally  distributed,  and  values  of  individual  soil 
properties  were  not  independent.     In  addition,  sampling 
locations  were  not  selected  randomly.     They  were  selected, 
as  often  is  the  case  in  a  soil  survey,  to  be  modal  pedons. 
This  sampling  procedure  contradicts  the  criterion  of 
randomness  important  in  determining  the  degrees  of  freedom 
in  classical  statistics. 

Statistical analyses need to be divided into two
groups: (i) those methods that can be used to analyze data
in an artificial context (i.e., laboratory or greenhouse),
and (ii) those techniques that can be used to analyze data
derived from studies in the natural environment (i.e., data
obtained from a soil survey).

In natural conditions it may be difficult to satisfy
the assumptions required for classical statistical
analyses. Testing the assumptions is recommended;
otherwise erroneous conclusions can be stated. When
assumptions are not met, it is necessary to study how
this fact affects the results of the analysis, to employ
alternative analyses (e.g., nonparametric analysis), or not
to use inferential statistics.

For example, in this study a classical statistical
technique, nested analysis of variance, was employed to
support results of the soil survey. But hypothesis
testing was not performed; thus, the assumptions of this
analysis were not required.

The assumption of PCA is related to homogeneity of
the variances. This assumption was fulfilled by
standardizing the data. Results of the PCA were validated
by the results of the nested analysis of variance on the
raw data.

The  geostatistical  analysis  has  the  assumption  of 
stationarity.     This  assumption  was  not  fulfilled.  Thus, 
universal  kriging,  which  takes  into  account  the  presence  of 
drift,  was  used  as  an  interpolation  method. 

Statistical analyses are needed to support and to
improve the results of soil surveys. The nested analysis
of variance is a technique that can be used very easily to
determine the extent of the within-map unit variance. PCA
is useful to select the important soil properties to study
the soil variability. PCA reduced the number of variables
selected for additional study. But a hiatus remains
between the results of the PCA and the results of the soil
survey. Soil scientists classified several pedons to a
specific soil series. But the PCA indicated a large degree
of dispersion within the soil series. This author believes
that this result is mainly due to the lack of emphasis on
soil and landscape relationships, and to the fact that
morphological soil properties were not considered in the
PCA. Therefore, more investigations are recommended in the
use of morphological properties, not only in PCA but also
in common statistical techniques used in quantitative
analysis.

The other statistical analysis used, geostatistics,
has two advantages over classical statistical techniques.
First, geostatistics takes into account the location of the
observations. Second, geostatistics not only considers the
observation values but also their geometric support (i.e.,
soil as a volume). Results of the geostatistical analysis
need to be validated by comparison of the isarithmic map
with the soil survey map. The use of individual PCs to
obtain the isarithmic map is one way to do so. More
investigations are recommended.


APPENDIX  A 


CLASSIFICATION  OF 
SERIES  STUDIED 


Soil Series     Taxonomic classification
                (Soil temperature is thermic)*

Alaga           Coated Typic Quartzipsamments.
Albany          Loamy, siliceous Grossarenic Paleudults.
Angie           Clayey, mixed Aquic Paleudults.
Apalachee       Very fine, montmorillonitic, Fluvaquentic
                Dystrochrepts.
Ardilla         Fine-loamy, siliceous Fragiaquic
                Paleudults.
Bethera         Clayey, mixed Typic Paleaquults.
Blanton         Loamy, siliceous Grossarenic Paleudults.
Bonifay         Loamy, siliceous Grossarenic Plinthic
                Paleudults.
Bonneau         Loamy, siliceous Arenic Paleudults.
Cantey          Clayey, kaolinitic Typic Albaquults.
Chipley         Coated Typic Quartzipsamments.
Chipola         Loamy, siliceous Arenic Hapludults.
Compass         Coarse-loamy, siliceous Plinthic
                Paleudults.
Cowarts         Fine-loamy, siliceous Typic Hapludults.
Coxville        Clayey, kaolinitic Typic Paleaquults.
Dothan          Fine-loamy, siliceous Plinthic Paleudults.
Duplin          Clayey, kaolinitic Aquic Paleudults.
Esto            Clayey, kaolinitic Typic Paleudults.
Escambia        Coarse-loamy, siliceous Plinthaquic
                Paleudults.
Faceville       Clayey, kaolinitic Typic Paleudults.
Fuquay          Loamy, siliceous Arenic Plinthic
                Paleudults.
Garcon          Loamy, siliceous Arenic Hapludults.
Greenville      Clayey, kaolinitic Rhodic Paleudults.
Goldsboro       Fine-loamy, siliceous Aquic Paleudults.
Hornsville      Clayey, kaolinitic Typic Hapludults.
Iuka            Coarse-loamy, siliceous, acid Aquic
                Udifluvents.
Kenansville     Loamy, siliceous Arenic Hapludults.
Kinston         Fine-loamy, siliceous, acid Typic
                Fluvaquents.
Lakeland        Uncoated Typic Quartzipsamments.
Leefield        Loamy, siliceous Arenic Plinthaquic
                Paleudults.
Lucy            Loamy, siliceous Arenic Paleudults.
Lyerly          Very-fine, montmorillonitic Vertic
                Hapludalfs.
Lynchburg       Fine-loamy, siliceous Aeric Paleaquults.
Malbis          Fine-loamy, siliceous Plinthic Paleudults.
Mulat           Coarse-loamy, siliceous Typic Ochraquults.
Nankin          Clayey, kaolinitic Typic Hapludults.
Norfolk         Fine-loamy, siliceous Typic Paleudults.
Ocilla          Loamy, siliceous Aquic Arenic Paleudults.
Oktibbeha       Very-fine, montmorillonitic Vertic
                Hapludalfs.
Orangeburg      Fine-loamy, siliceous Typic Paleudults.
Pansey          Fine-loamy, siliceous Plinthic Paleaquults.
Pantego         Fine-loamy, siliceous Umbric Paleaquults.
Pelham          Loamy, siliceous Arenic Paleaquults.
Plummer         Loamy, siliceous Grossarenic Paleaquults.
Rains           Fine-loamy, siliceous Typic Paleaquults.
Redbay          Fine-loamy, siliceous Rhodic Paleudults.
Rutlege         Sandy, siliceous Typic Humaquepts.
Sapelo          Sandy, siliceous Ultic Haplaquods.
Shubuta         Clayey, mixed Typic Paleudults.
Stilson         Loamy, siliceous Arenic Plinthic
                Paleudults.
Surrency        Loamy, siliceous Arenic Umbric Paleaquults.
Tifton          Coarse-loamy, siliceous Plinthic
                Paleudults.
Troup           Loamy, siliceous Grossarenic Paleudults.
Wagram          Loamy, siliceous, Arenic Paleudults.
Yemasse         Fine-loamy, mixed Aeric Ochraquults.
Yonges          Fine-loamy, mixed Typic Ochraqualfs.


*  Source:  Calhoun  et  al.,  1974;  Carlisle  et  al.,  1978, 
1981,  1985;  I.F.A.S.  Soil  Characterization 
Laboratory,  unpublished  data. 


APPENDIX  B 

GEOGRAPHIC  COORDINATES  OF  PEDONS  STUDIED 


Pedon    Laboratory     X *            Y *
number   number         Coordinate     Coordinate
                        (4.8 km/X)     (4.6 km/Y)

  1      153-159          42.50          18.75
  2      140-146          43.10          17.40
  3      133-139          43.80          21.85
  4      147-152          42.00          19.00
  5      160-166          40.50          13.10
  6      167-172           3.60          17.15
  7      101-106           5.25          16.60
  8      107-111           3.40          16.70
  9      397-401           5.80          17.45
 10      409-414           3.90          17.45
 11      402-408          29.90          19.15
 12      83-89            34.20          19.20
 13      90-95             9.40          21.55
 14      389-396          10.10          21.60
 15      96-99            33.70          19.20
 16      379-383          43.80          15.75
 17      384-388           7.50          17.85
 18      327-332          43.50          14.50
 19      271-277          32.90          17.30
 20      295-301          42.55          16.45
 21      1072-1080        42.20          17.55
 22      322-326          43.35          17.60
 23      333-337           7.65          21.50
 24      338-342           7.25          21.60
 25      278-281           7.50          21.60
 26      282-284           9.90          21.60
 27      285-287           7.40          21.60
 28      288-291          41.90          16.50
 29      292-294           5.20          16.80
 30      1065-1071        40.50          13.85
 31      316-321          41.40          16.30
 32      1056-1064        43.30          18.30
 33      302-307          18.20          43.00
 34      1895-1899        64.25           9.85
 35      1475-1479        59.25          11.55
 36      1427-1433        46.80          20.15
 37      1446-1453        64.80          10.95
 38      1454-1460        37.50          20.70
 39      1469-1474        37.25          20.50
 40      1887-1894        42.60          17.40
 41      1880-1886        37.90          20.35
 42      1872-1879        62.65           8.90
 43      1434-1441        59.00          11.40
 44      1900-1905        44.50           0.30
 45      1461-1468        59.30          10.85
 46      1480-1484        37.80          20.10
 47      2049-2055        58.90          11.50
 48      1375-1379        65.00           9.30
 49      1575-1581        21.15          21.35
 50      628-633          22.40          21.20
 51      2056-2064        20.90          21.25
 52      2065-2071        23.10          20.00
 53      1561-1567        23.95          18.40
 54      1609-1614        22.00          21.10
 55      1582-1588        23.15          20.85
 56      1597-1602        65.10          10.15
 57      634-637          64.90           9.55
 58      1906-1910        68.60          10.95
 59      1603-1608        68.85          12.70
 60      2262-2268        66.90          10.40
 61      2823-2827        25.25          18.00
 62      2382-2387        23.55          20.70
 63      2299-2306        66.50          12.80
 64      2842-2847        65.20           8.70
 65      3283-3288        38.10           7.00
 66      2360-2364        65.35          10.15
 67      2365-2369        68.50          11.95
 68      2377-2381        67.75           8.75
 69      2354-2359        65.30           9.45
 70      3322-3328        68.70          13.50
 71      2866-2874        64.90          10.15
 72      2370-2376        79.70          11.55
 73      2835-2841        75.30          10.85
 74      3268-3273        67.30          13.35
 75      2291-2298        67.95           9.70
 76      2857-2865        66.65          11.45
 77      2347-2353        72.45          13.40
 78      2875-2880        65.70           4.60
 79      2276-2283        43.95          21.80
 80      2617-2623        40.40          18.15
 81      2307-2312        40.45          11.30
 82      2828-2834        43.70          15.90
 83      2624-2630        40.10          11.25
 84      2284-2290         6.40          15.00
 85      2637-2644        40.05          19.70
 86      2810-2817        40.80          15.60
 87      2388-2392        43.95          15.85
 88      2631-2636         9.35          17.45
 89      2645-2652         3.00          17.20
 90      2653-2658         5.80          14.85
 91      3274-3282        10.95          18.10
 92      3417-3425        11.95          14.25
 93      4303-4309         3.10          16.40
 94      4064-4069         3.40          14.00
 95      4070-4077         5.55          16.00
 96      4276-4283         5.45          16.80
 97      4056-4063        34.60          15.50
 98      3395-3401        28.00          15.25
 99      4086-4092         7.20          17.70
100      3388-3394         6.60          20.60
101      4101-4107         5.10          16.50
102      4270-4275         5.15          14.10
103      3410-3416        42.10          18.40
104      4284-4291        11.45          21.55
105      4108-4114        49.15          14.75
106      4078-4085        64.70          10.00
107      4263-4269        65.15          10.55
108      4298-4302        41.75          18.55
109      4050-4055        58.70          11.65
110      4619-4623        56.60           8.50
111      4756-4763        62.60          12.20
112      4847-4853        46.60          19.80
113      4478-4486        46.95          20.10
114      4788-4795        65.00          11.35
115      4874-4881         7.35          12.60
116      4510-4516        47.00          19.85
117      5214-5217        65.10          12.70
118      4796-4803         7.65          19.95
119      4504-4509        60.90          10.90
120      4991-4997        60.80          11.40
121      4517-4522        59.40           8.40
122      4750-4755        60.60          10.50
123      4764-4773        24.00          14.55
124      4499-4503        21.00          15.70
125      5131-5138        22.05          21.35
126      4493-4498        29.80          14.70
127      5139-5147        23.15          16.10
128      5148-5157        60.60          13.20
129      4898-4903        60.75          10.80
130      5027-5032        22.30          21.35
131      4470-4477        63.70          12.55
132      5062-5066        26.60          19.20
133      4804-4810        36.95           4.90
134      4836-4839        20.25          20.90
135      5021-5026        65.60           7.10
136      4632-4639        27.00          15.80
137      4998-5003        38.50           8.60
138      5207-5213        23.90          14.70
139      4488-4492        67.55          12.50
140      4892-4897        25.15          20.60
141      5726-5731        28.80          10.20
142      5732-5738        29.00           9.60
143      5714-5719        67.40           9.20
144      5511-5518        20.20          14.60
145      5720-5725        68.30          12.15
146      5493-5498        38.75           4.00
147      5499-5504        38.45          11.55
148      5505-5510        78.10          10.00
149      5396-5402        77.80          10.70
150      5524-5528        27.65          14.45
151      5403-5410        26.55          14.00


* Coordinates are relative to a point of origin 30° 00' 00" N
and 87° 24' 18" W (X = 0 and Y = 0) chosen to ensure that
all coordinates would be positive.
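
The coordinates in this appendix can be converted to distances as follows. This is a minimal sketch, assuming that one X unit equals 4.8 km and one Y unit equals 4.6 km, as the column headings indicate, and that a flat (planar) approximation is adequate over the study area. The function names are hypothetical; the two coordinate pairs are those listed for pedons 1 and 2.

```python
import math

KM_PER_X_UNIT = 4.8  # from the column heading "4.8 km/X"
KM_PER_Y_UNIT = 4.6  # from the column heading "4.6 km/Y"


def to_km(x, y):
    """Convert grid coordinates to kilometers east and north of the origin."""
    return x * KM_PER_X_UNIT, y * KM_PER_Y_UNIT


def distance_km(p, q):
    """Straight-line distance in kilometers between two grid locations."""
    px, py = to_km(*p)
    qx, qy = to_km(*q)
    return math.hypot(px - qx, py - qy)


pedon_1 = (42.50, 18.75)
pedon_2 = (43.10, 17.40)
separation = distance_km(pedon_1, pedon_2)
```

Distances computed in this way are the lags that enter the semi-variogram calculation.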


APPENDIX  C 


SEMI-VARIOGRAMS FOR DIRECTIONS
WITH LARGEST VARIABILITY


APPENDIX D


CONTOUR  MAPS  FOR  DIRECTIONS 
WITH LARGEST VARIABILITY


APPENDIX E


MAP  OF  PHYSIOGRAPHIC  REGIONS 
IN NORTHWEST FLORIDA